Hi!I had found (since only 2 hours) that KixForm can to do used by Python +PyWin32At the end of the ...
By doremichellasido
Hi!I had found (since only 2 hours) that KixForm can to do used by Python +PyWin32At the end of the ...
By do_re_mi_chel_la_si_do
Mike Kent <mrmakent@cox.net> writes:> A bug issue has been opened in the Python Trac system...
By ben_finney
"Ben Finney" <bignose+hates-spam@benfinney.id.au> wrote in messagenews:8763xi7wme.fsf_-_@benfi...
By terry_reedy
On Jan 24, 5:13 pm, "Terry Reedy" <tjre...@udel.edu> wrote:> "Ben Finney" <bignose+hates...
By mike_kent
Hi,I'd like to create a temporaty shell script. Therefore I use tempfilewith NamedTemporaryFile. The...
By christophe_delarue_gmail_com
Hi,I'd like to create a temporaty shell script. Therefore I use tempfilewith NamedTemporaryFile...
By christophe_delarue_gmail
I am very new to both programming and Pyhton and while trying to dosome practice using A byte of pyt...
By willkab6_gmail, 3 Comments
Hi,I have a program that makes a call to a function in a different pythonscript that I wrote. But, w...
By seancron, 1 Comments
Is anyone using the Kiwi wrappers for pygtk? Is there an updatedversion of it? The one I can find is...
By laughlin_josephv, 1 Comments
Hi guys,probably dumb question but after googling a lot I couldn't find an answer.with a simple...
By gmtaglia, 2 Comments
Greetings.Im trying to write a program that can be run from the command line.If I want to search for...
By peter_hansen, 6 Comments
Is anyone using the Kiwi wrappers for pygtk? Is there an updatedversion of it? The one I can find is...
By laughlin_joseph_v, 1 Comments
Greetings.Im trying to write a program that can be run from the command line.If I want to search for...
By peterhansen, 6 Comments
Hi,Is there a module for parsing spec files available?Marian----Best Regards,Marian Jancarsoftware d...
By marianjancar, 4 Comments
John Marshall wrote:
> For strings of > 1 character, what are the chances
> that hash(st) and hash(st[::-1]) would return the
> same value?
Why not grab a dictionary and do the stats yourself?
--Scott David Daniels
scott.daniels...acm.org
scottdavid_daniels | Wed, 07 May 2008 00:27:00 GMT |
Scott David Daniels wrote:
> John Marshall wrote:
>
>
> Why not grab a dictionary and do the stats yourself?
I was actually interested in the mathematical/probability
side rather than the empirical w/r to the current
hash function in python. Although I imagine I could do
a brute force test for x-character strings.
John
john_marshall | Wed, 07 May 2008 00:28:00 GMT |
John Marshall wrote:
> I was actually interested in the mathematical/probability
> side rather than the empirical w/r to the current
> hash function in python. Although I imagine I could do
> a brute force test for x-character strings.
Hah. No.
At least on the version I have handy (Py 2.2.3 on Itanium2), hash
returns a 64-bit value. Brute-forcing that in any reasonable length of
time is rather impossible.
christopher_subich | Wed, 07 May 2008 00:29:00 GMT |
John Marshall wrote:
> Scott David Daniels wrote:
> I was actually interested in the mathematical/probability
> side rather than the empirical w/r to the current
> hash function in python.
Well, the probability depends on the universe you are choosing from.
That was why I was suggesting a dictionary: words may well have a
different distribution than arbitrary strings.
--Scott David Daniels
scott.daniels...acm.org
scottdavid_daniels | Wed, 07 May 2008 00:30:00 GMT |
John Marshall wrote:
> For strings of > 1 character, what are the chances
> that hash(st) and hash(st[::-1]) would return the
> same value?
the algorithm is described here:
http://effbot.org/zone/python-hash.htm
feel free to do a mathematical analysis. a non-mathematical analysis
says that if the chance is high if your word is a palindrome, and low if
it's not.
</F>
fredrik_lundh | Wed, 07 May 2008 00:31:00 GMT |
[John Marshall]
> For strings of > 1 character, what are the chances
> that hash(st) and hash(st[::-1]) would return the
> same value?
Python's string hash algorithm is non-commutative, so a collision with
a reversed string is not likely. The exact answer depends on the
population of strings being hashed, but it's not hard to compute
collision statistics for a sampling of those strings:
collisions = len(sample) - len(set(hash(s) for s in sample))
FWIW, here is how Python computes string hash values:
static long
string_hash(PyStringObject *a)
{
register int len;
register unsigned char *p;
register long x;
len = a->ob_size;
p = (unsigned char *) a->ob_sval;
x = *p << 7;
while (--len >= 0)
x = (1000003*x) ^ *p++;
x ^= a->ob_size;
if (x == -1)
x = -2;
return x;
}
> My goal is to uniquely identify multicharacter strings,
> all of which begin with "/" and never end with "/".
> Therefore, st != st[::-1].
Just use a set -- no string reversal is needed for detection of unique
multicharacter strings..
Raymond
raymond_hettinger | Wed, 07 May 2008 00:32:00 GMT |
[John Marshall]
> For strings of > 1 character, what are the chances
> that hash(st) and hash(st[::-1]) would return the
> same value?
First, if `st` is a string, `st[::-1]` is a list. Do you really mean
to compare string hashes with list hashes here? I'm going to assume
not.
Second, what are your assumptions about (a) the universe of strings;
and, (b) the hash function?
Assuming a finite universe of strings (also finite ;-)), and a hash
function that returns each of its H possible results "at random"
(meaning that there's no algorithmic way to predict any bit of the
hash output short of running the hash function), then the probability
that two distinct strings have the same hash is 1/H. It doesn't
matter to this outcome whether one input is the reversal of the other.
> My goal is to uniquely identify multicharacter strings,
Unclear what that means. Obviously, if your string universe contains
more than H strings, it's impossible for any hash function with H
possible values to return a different hash value each input.
> all of which begin with "/" and never end with "/".
> Therefore, st !=3D st[::-1].
As at the start, I think you mean to say st !=3D "".join(st[::-1]). I
don't know why you might think that matters, though. Is it simply
because this condition eliminates palindromes from your input
universe?
Anyway, to be concrete, using CPython's hash function on a 32-bit box,
H =3D 2**32-1. Call a string `s` bad iff:
s[0] =3D=3D "/" and s[-1] !=3D "/" and hash(s) =3D=3D hash("".join(reve=
rsed(s)))
Then there are no bad strings of length 1, 2, 3, or 4. There are 4
bad strings of length 5:
'/\xde&\xf6C'
'/\xca\x0e\xfaC'
'/\xc4\x06\xfcC'
'/\xad\xd6\x01\xd6'
I didn't think about this -- I just wrote a little program to try all
~=3D 4 billion such strings. So if your universe is the set of all
5-character 8-bit strings that start with '/' but don't end with '/',
and you pick inputs uniformly at random from that universe, the chance
of a hash match between a string and its reversal is
4 / (256**3 * 255)
or a little less than 1 in a billion. For a truly random hash
function with a 32-bit output, the chance would be a little less than
1 in 4 billion.
It would be mildly surprising if those odds got worse as string length
increased. The md5 and sha hashes have much larger H, and were
designed for (among other things) good collision resistance.
timpeters | Wed, 07 May 2008 00:33:00 GMT |
[John Marshall]
[Tim Peters]
[Jeff Epler]
> It is?
>
> 'sgorf hcnerf'
> (Python 2.3)
Indeed that's right. Python 2.4+ also. My apologies! Good thing it
doesn't matter to the rest of the exposition ;-)
timpeters | Wed, 07 May 2008 00:34:00 GMT |
Tim Peters wrote:
> First, if `st` is a string, `st[::-1]` is a list.
I hate to question the great timbot, but am I missing something?
'edcba'
STeVe
steven_bethard | Wed, 07 May 2008 00:35:00 GMT |