Hi, in GNUmed (wiki.gnumed.de) we use schema hashing to detect whether a database can safely be upgraded or used by a client. The general procedure is this: - generate a line-by-line representation of the database objects in the format "schema.table.column::data type" from the information catalog - sort those lines by their md5() hash (to avoid locale related sorting issues - or so we thought) - generate an md5() hash over the concatenation and compare that to known hashes This has worked nicely so far. However, recently a Norwegian user found that his setup sorts those md5() line hashes differently from other setups. Be his setup broken or not we'd like to avoid this issue in the future. What immediately comes to mind is to convert the md5() hex string to integer and sort by that (numeric sorting should, by all means, be universally acceptable) but, alas, it's to large even for bigint. So, I was thinking to extract parts of the string, convert those into ints and re-concatenate those ints into a string and sort by that - sorting digits-only strings should be pretty safe universally, too. However, I am not entirely sure whether I'd be running a higher risk of collisions that way. (Much simplified) example: md5 = x'fe' (I know an md5 can never really yield that value) part 1 = x'f' = 15 part 2 = x'e' = 14 string to sort by = '1514' (while in reality 'fe' = 254 => '254') I *think* I should be safe but would like to hear another opinion. --- I just realized this is not an SQL or even PostgreSQL related question at all so to justify my post: Should I be going about this sorting or hashing or detection business in another way entirely which can be done at the SQL level ? ;-) Thanks, Karsten -- GPG key ID E4071346 @ wwwkeys.pgp.net E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346 ---------------------------(end of broadcast)--------------------------- TIP 5: don't forget to increase your free space map settings