On Sat, 2011-01-08 at 16:55 +0800, WalkinRaven wrote:
> PHP 5.3 PCRE
>
> Regular Express to match domain names format according to RFC 1034 -
> DOMAIN NAMES - CONCEPTS AND FACILITIES
>
> /^
> (
> [a-z] |
> [a-z] (?:[a-z]|[0-9]) |
> [a-z] (?:[a-z]|[0-9]|\-){1,61} (?:[a-z]|[0-9]) ) # One label
>
> (?:\.(?1))*+ # More labels
> \.? # Root domain name
> $/iDx
>
> This rule matches only <label> and <label>. but not <label>.<label>...
>
> I don't know what wrong with it.
>
> Thank you.
>

I think trying to do all of this in one regex will prove more trouble
than it's worth. Maybe breaking it down into something like this:

<?php
$domain = "www.ashleysheridan.co.uk";
$valid = false;
$tlds = array(
    'aero', 'asia', 'biz', 'cat', 'com', 'coop', 'edu', 'gov', 'info', 'int',
    'jobs', 'mil', 'mobi', 'museum', 'name', 'net', 'org', 'pro', 'tel',
    'travel', 'xxx',
    'ac', 'ad', 'ae', 'af', 'ag', 'ai', 'al', 'am', 'an', 'ao', 'aq', 'ar',
    'as', 'at', 'au', 'aw', 'ax', 'az', 'ba', 'bb', 'bd', 'be', 'bf', 'bg',
    'bh', 'bi', 'bj', 'bm', 'bn', 'bo', 'br', 'bs', 'bt', 'bv', 'bw', 'by',
    'bz', 'ca', 'cc', 'cd', 'cf', 'cg', 'ch', 'ci', 'ck', 'cl', 'cm', 'cn',
    'co', 'cr', 'cu', 'cv', 'cx', 'cy', 'cz', 'de', 'dj', 'dk', 'dm', 'do',
    'dz', 'ec', 'ee', 'eg', 'er', 'es', 'et', 'eu', 'fi', 'fj', 'fk', 'fm',
    'fo', 'fr', 'ga', 'gb', 'gd', 'ge', 'gf', 'gg', 'gh', 'gi', 'gl', 'gm',
    'gn', 'gp', 'gq', 'gr', 'gs', 'gt', 'gu', 'gw', 'gy', 'hk', 'hm', 'hn',
    'hr', 'ht', 'hu', 'id', 'ie', 'il', 'im', 'in', 'io', 'iq', 'ir', 'is',
    'it', 'je', 'jm', 'jo', 'jp', 'ke', 'kg', 'kh', 'ki', 'km', 'kn', 'kp',
    'kr', 'kw', 'ky', 'kz', 'la', 'lb', 'lc', 'li', 'lk', 'lr', 'ls', 'lt',
    'lu', 'lv', 'ly', 'ma', 'mc', 'md', 'me', 'mg', 'mh', 'mk', 'ml', 'mm',
    'mn', 'mo', 'mp', 'mq', 'mr', 'ms', 'mt', 'mu', 'mv', 'mw', 'mx', 'my',
    'mz', 'na', 'nc', 'ne', 'nf', 'ng', 'ni', 'nl', 'no', 'np', 'nr', 'nu',
    'nz', 'om', 'pa', 'pe', 'pf', 'pg', 'ph', 'pk', 'pl', 'pm', 'pn', 'pr',
    'ps', 'pt', 'pw', 'py', 'qa', 're', 'ro', 'rs', 'ru', 'rw',
    'sa', 'sb', 'sc', 'sd', 'se', 'sg', 'sh', 'si', 'sj', 'sk', 'sl', 'sm',
    'sn', 'so', 'sr', 'st', 'su', 'sv', 'sy', 'sz', 'tc', 'td', 'tf', 'tg',
    'th', 'tj', 'tk', 'tl', 'tm', 'tn', 'to', 'tp', 'tr', 'tt', 'tv', 'tw',
    'tz', 'ua', 'ug', 'uk', 'us', 'uy', 'uz', 'va', 'vc', 've', 'vg', 'vi',
    'vn', 'vu', 'wf', 'ws', 'ye', 'yt', 'za', 'zm', 'zw',
);

// The whole name must not exceed 253 characters
if(strlen($domain) <= 253)
{
    $labels = explode('.', $domain);

    // The last label must be a known TLD
    if(in_array($labels[count($labels)-1], $tlds))
    {
        // Check every label in front of the TLD
        for($i=0; $i<count($labels) -1; $i++)
        {
            // A label is wrong if it is longer than 63 characters, breaks
            // the LDH rule (letters, digits and hyphens only, no hyphen at
            // either end), or consists of nothing but digits
            if(strlen($labels[$i]) > 63
                || !preg_match('/^[a-z0-9](?:[a-z0-9\-]*[a-z0-9])?$/', $labels[$i])
                || preg_match('/^[0-9]+$/', $labels[$i]))
            {
                $valid = false;
                break; // no point continuing if one label is wrong
            }
            else
            {
                $valid = true;
            }
        }
    }
}
var_dump($valid);

This matches the last label against the list of TLDs, and each label
before it against the standard a-z, 0-9 and hyphen rule given as the
preferred characters for a label (the LDH rule), with the restriction
that a label can't start or end with a hyphen. (Note that this lets a
label start with a digit, which RFC 1123 permits even though RFC 1034
asks for a leading letter.) Also, each label is checked to ensure it
doesn't run over 63 characters, and the whole thing isn't over 253
characters. Lastly, each label is checked to ensure it doesn't consist
entirely of digits.

I've tested it only with my domain so far, but it should work fairly
well. As I said before, I couldn't think of a way to do it all with one
regex. It could probably be done, but would you really want to create a
huge and difficult to read/understand expression just because it's
possible?

Thanks,
Ash
http://www.ashleysheridan.co.uk
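
For comparison, here is a rough sketch of what a single-pattern version of the RFC 1034 rule from the quoted message might look like. The function name matches_rfc1034_name() is made up for illustration and is not from the thread. It writes the label sub-pattern out twice instead of calling it with (?1). One possible explanation for the behaviour described in the quoted message, assuming the PCRE build in PHP 5.3 treats a (?1) call as an atomic group (which is how PCRE documented recursion at the time), is that the recursive call settles for the first alternative, a single letter, and never goes back to try the longer ones. Note that this sketch follows the letter-first rule of the quoted pattern and does no TLD check, so it is not equivalent to the loop above.

```php
<?php
// Hypothetical helper, for illustration only.
function matches_rfc1034_name($name)
{
    // One label: a letter, optionally followed by up to 61 letters, digits
    // or hyphens and then a final letter or digit (1 to 63 characters,
    // never ending in a hyphen).
    $label = '[a-z](?:[a-z0-9-]{0,61}[a-z0-9])?';

    // One label, any number of ".label" repeats, and an optional trailing
    // dot for the root. No subpattern recursion is used.
    $pattern = '/^' . $label . '(?:\.' . $label . ')*\.?$/iD';

    // The overall length limit is checked outside the pattern.
    return strlen($name) <= 253 && preg_match($pattern, $name) === 1;
}

var_dump(matches_rfc1034_name('www.ashleysheridan.co.uk'));   // bool(true)
var_dump(matches_rfc1034_name('example.'));                   // bool(true)
var_dump(matches_rfc1034_name('-bad.example.com'));           // bool(false)
var_dump(matches_rfc1034_name(str_repeat('a', 64) . '.com')); // bool(false)
```

Even in this small form the pattern is harder to read than the loop, and adding the TLD list or the all-digits check to it would make it longer still.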