Thank you, I would! What versions have you tested the patch against? I am sorry but I am not too familiar with applying patches against the main program, is there documentation on how to apply the patch? Is there a way to roll the patch back/remove it? Would I be able to script the installation of the patch (I would expect so). The reason for the last question is that I expect (hope) many people will use it and I want to make it as easy as possible for a user to simply select or unselect the patch if it works well. If I can script the install and removal of this patch then I can do just this and that would be wonderful. Thank you again! どうも ありがとう ございます! (I hope that is right, my Japanese is still elementary. :) ) Madison Tatsuo Ishii wrote: > We have developed patches which relaxes the character validation so > that PostgreSQL accepts invalid characters. It works like this: > > 1) new postgresql.conf item "mbstr_check" added. > 2) if mbstr_check = 0 then invalid characters are not accepted > (same as current PostgreSQL behavior). This is the default. > 3) if mbstr_check = 1 then invalid characters are accepted with > WARNING > 4) if mbstr_check = 2 then invalid characters are accepted without any > warnings > 5) We have checked PostgreSQL source code if accepting invalid > characters makes some troubles. We have found that we need to fix a > place and the fix is included in the patches. > > Madison, > If you are interested in the patches, I could send it to you. > > Hackers, > Do you think the functionality something like above is worth to add to > PostgreSQL? > -- > Tatsuo Ishii > > >>Hi all, >> >> I've been chasing down a bug and from what I have learned it may be >>because of how postgreSQL (8.0.2 on Fedora Core 4 test 2) handles >>invalid unicode. I've been given some ideas on how to try to catch >>invalid unicode but it seems expensive so I am hoping there is a >>postgresql way to deal with this problem. >> >> I've run into a problem where a bulk postgres "COPY..." statement is >>dieing because one of the lines contains a file name with an invalid >>unicode character. In nautilus this file has '(invalid encoding)' and >>the postgres error is 'CONTEXT: COPY file_info_3, line 228287, column >>file_name: "Femme Fatal\uffff.url"'. >> >> To actually look at the file from the shell (bash) shows what appears >>to be a whitespace but when I copy/paste the file name I get the >>'\uffff' you see above. >> >> I could, with the help of the TLUG people, use regex to match for an >>invalid character and skip the file but that is not ideal. The reason is >>that this is for my backup program and invalid unicode or not, the >>contents of the file may still be important and I would prefer to have >>it in the database so that it is later copied. I can copy and move the >>file in the shell so the file isn't apparently in an of itself corrupt. >> >> So then, is there a way I can tell postresql to accept the invalid >>unicode name? Here is a copy of my schema: >> >>tle-bu=> \d file_info_2 >> Table "public.file_info_2" >> Column | Type | Modifiers >>----------------------+----------------------+----------------------------------------- >> file_group_name | text | >> file_group_uid | bigint | not null >> file_mod_time | bigint | not null >> file_name | text | not null >> file_parent_dir | text | not null >> file_perm | text | not null >> file_size | bigint | not null >> file_type | character varying(2) | not null default >>'f'::character varying >> file_user_name | text | >> file_user_uid | bigint | not null >> file_backup | boolean | not null default true >> file_display | boolean | not null default false >> file_restore_display | boolean | not null default false >> file_restore | boolean | not null default false >>Indexes: >> "file_info_2_display_idx" btree (file_type, file_parent_dir, file_name) >> >> 'file_name' and 'file_parent_dir' are the columns that could have >>entries with the invalid unicode characters. Maybe I could/should use >>something other than 'text'? These columns could contain anything that a >>file or directory name could be. >> >> Thanks! >> >>Madison >> >>-- >>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >>Madison Kelly (Digimer) >>TLE-BU, The Linux Experience; Back Up >>http://tle-bu.thelinuxexperience.com >>-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq