Re: Renaming all variables in a repository

Eddie Drapkin wrote:
> Hey all,
> we've got a repository here at work, with something like 55,000 files
> in it. For the last few years, we've been naming $variables_like_this
> and functions_the_same($way_too).  And now we've decided to switch to
> camelCasing everything and I've been tasked with somehow determining
> if it's possible to automate this process.  Usually, I'd just use the
> IDE refactoring functionality, but doing it on a
> per-method/per-function and a per-variable basis would take weeks, if
> not longer, not to mention driving everyone insane.
> 
> I've tried with regular expressions, but I can't make them smart
> enough to distinguish between builtins and userland code.  I've looked
> at the tokenizer and it seems to be the right way forward, but that's
> also a huge project to get that to work.
> 
> I was wondering if anyone had had any experience doing this and could
> either point me in the right direction or just down and out tell me
> how to do it.

Hi Eddie,

That's quite the task :).

You're going to need to scan the source to generate a list of every
variable and function name using the tokenizer.  Fortunately, this is
easy - with the caveat that if you do this anywhere in your source:

$a = $this->{$constructed . '_name'}();

you will have to handle these manually.
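
One way to find those spots ahead of time is to let the tokenizer flag every "->" that is followed by something other than a literal name. This is only a rough sketch: findDynamicAccess() is a made-up helper name, and it assumes the only dynamic forms in the codebase are ->{...} and ->$name.

```php
<?php
// Hypothetical helper: returns the line numbers where a member is accessed
// through a computed name, e.g. $obj->{$expr}() or $obj->$name.
function findDynamicAccess($source)
{
    $lines = array();
    $afterArrow = false;
    $arrowLine = 0;
    foreach (token_get_all($source) as $token) {
        if ($afterArrow) {
            if (is_array($token) && $token[0] == T_WHITESPACE) {
                continue; // whitespace may sit between -> and the name
            }
            // a literal name is T_STRING; "{" or a variable means computed
            if ($token === '{' || (is_array($token) && $token[0] == T_VARIABLE)) {
                $lines[] = $arrowLine;
            }
            $afterArrow = false;
        }
        if (is_array($token) && $token[0] == T_OBJECT_OPERATOR) {
            $afterArrow = true;
            $arrowLine = $token[2]; // element 2 of a token is its line number
        }
    }
    return $lines;
}

$code = "<?php\n"
      . "\$a = \$this->{\$constructed . '_name'}();\n"
      . "\$b = \$this->plain_name();\n"
      . "\$c = \$this->\$prop;\n";
print_r(findDynamicAccess($code));
```

For the sample above it reports lines 2 and 4, which are the two accesses that would need a manual look.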

Basically, run token_get_all() on the source and record every
T_VARIABLE in an array.  Then, scan for:

1) T_FUNCTION T_WHITESPACE* T_STRING (function and method declarations)
2) T_OBJECT_OPERATOR T_WHITESPACE* T_STRING (method calls and property accesses)
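
To see those sequences in a real token stream, a quick sketch like the following prints the token names for a one-line snippet. The snippet and the $names array are made up for illustration; token_get_all() and token_name() are the standard tokenizer functions.

```php
<?php
// Print the name and text of each multi-character token in a snippet,
// leaving out whitespace so the sequences are easier to read.
$snippet = "<?php function do_thing(\$some_arg) { return \$this->other_thing; }";
$names = array();
foreach (token_get_all($snippet) as $token) {
    if (is_array($token) && $token[0] != T_WHITESPACE) {
        $names[] = token_name($token[0]) . ' ' . $token[1];
    }
}
echo implode("\n", $names), "\n";
```

Single characters such as ( and { come back as plain strings rather than arrays, which is why the loop checks is_array() first.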

<?php
$replace = array();
foreach (new RegexIterator(new RecursiveIteratorIterator(new
RecursiveDirectoryIterator('/path/to/src')), '/\.php$/',
RegexIterator::MATCH, RegexIterator::USE_KEY) as $path => $file) {
    $source = file_get_contents($path);

    $checkForID = false; // true right after "function" or "->"
    $last = '';          // the "function" or "->" text plus any whitespace
    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            // single-character token such as ( or {, so no name follows
            $checkForID = false;
            $last = '';
            continue;
        }

        if ($checkForID) {
            if ($token[0] == T_WHITESPACE) {
                $last .= $token[1];
                continue;
            }
            $checkForID = false;
            if ($token[0] != T_STRING) {
                $last = '';
                continue;
            }
            // keep the prefix so the replacement only hits this usage
            $token[1] = $last . $token[1];
            $last = '';
        } elseif ($token[0] == T_FUNCTION || $token[0] == T_OBJECT_OPERATOR) {
            $checkForID = true;
            $last = $token[1];
            continue;
        } elseif ($token[0] == T_STRING) {
            if (function_exists($token[1])) {
                continue; // skip internal functions
            }
            if (strtolower($token[1]) != $token[1]) {
                continue; // assuming you UPPER-CASE constants, this skips them
            }
        } elseif ($token[0] != T_VARIABLE) {
            continue;
        }

        // we get to here if we've found one to process
        if (preg_match('/(^|\W)_/', $token[1])) {
            continue; // leave $_POST, __construct() and other _names alone
        }
        $new = explode('_', $token[1]);
        $new = array_map('ucfirst', $new);
        $new[0] = lcfirst($new[0]); // for your camelCasing

        $new = implode('', $new);
        if ($new != $token[1]) {
            // keyed by the old name so each name is only recorded once
            $replace[$token[1]] = array($token[1], $new);
        }
    }
}
?>
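
The conversion step from the loop above can be tried on its own before pointing it at 55,000 files. A minimal sketch, with toCamelCase() as a made-up helper name that mirrors the explode/ucfirst/implode sequence:

```php
<?php
// Hypothetical helper mirroring the explode/ucfirst/implode step.
// The leading "$" or "->" stays in place because ucfirst() and lcfirst()
// leave a non-letter first character untouched.
function toCamelCase($name)
{
    $parts = explode('_', $name);
    $parts = array_map('ucfirst', $parts);
    $parts[0] = lcfirst($parts[0]);
    return implode('', $parts);
}

echo toCamelCase('$variables_like_this'), "\n"; // $variablesLikeThis
echo toCamelCase('->functions_the_same'), "\n"; // ->functionsTheSame
echo toCamelCase('$_POST'), "\n";               // $POST
```

The last line shows why names with a leading underscore (superglobals, magic methods like __construct) are best kept out of the rename list: the underscore is simply dropped.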

Next, load each file (you should use RecursiveIteratorIterator with a
RecursiveDirectoryIterator and some kind of filter, probably
RegexIterator, to grab the PHP source files), and then iterate over the
list of variable names somewhat like this:

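<?php
// do the longer names first so a short name can't clobber part of a longer one
usort($replace, function ($a, $b) { return strlen($b[0]) - strlen($a[0]); });

foreach (new RegexIterator(new RecursiveIteratorIterator(new
RecursiveDirectoryIterator('/path/to/src')), '/\.php$/',
RegexIterator::MATCH, RegexIterator::USE_KEY) as $path => $file) {
    $source = file_get_contents($path);
    foreach ($replace as $items) {
        list($old, $new) = $items;

        // (?![A-Za-z0-9_]) keeps $foo_bar from matching inside $foo_bar_baz
        $source = preg_replace_callback(
            '/' . preg_quote($old, '/') . '(?![A-Za-z0-9_])/',
            function () use ($new) { return $new; },
            $source);

        if ($old[0] == '$') {
            // properties are declared as $name but accessed as ->name
            $source = preg_replace(
                '/->(\s*)' . preg_quote(substr($old, 1), '/') . '(?![A-Za-z0-9_])/',
                '->\\1' . substr($new, 1),
                $source);
        }
    }
    file_put_contents($path, $source);
}
?>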

Voila, code refactored.

I trust you know this, but don't run that example code without testing
it on a limited sandbox and comparing the results first :).  I did not
test anything except the RegexIterator part, to make sure that it
actually grabbed PHP files; the rest is based on my experience
tokenizing for parsing PHP when writing tools like phpDocumentor.
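
For the comparing part, something as small as the following may do for a first pass. It is a sketch only: changedLines() is a made-up helper, and it assumes the rename never adds or removes lines, so a plain line-by-line comparison is enough.

```php
<?php
// Hypothetical helper: list the lines that differ between the original and
// the rewritten source, keyed by 1-based line number.
function changedLines($before, $after)
{
    $old = explode("\n", $before);
    $new = explode("\n", $after);
    $changes = array();
    foreach ($old as $i => $line) {
        if (!isset($new[$i]) || $new[$i] !== $line) {
            $changes[$i + 1] = array($line, isset($new[$i]) ? $new[$i] : null);
        }
    }
    return $changes;
}

$before = "<?php\n\$my_var = 1;\necho \$my_var;\necho 'done';\n";
$after  = str_replace('$my_var', '$myVar', $before);
foreach (changedLines($before, $after) as $lineNo => $pair) {
    echo "$lineNo: {$pair[0]}  =>  {$pair[1]}\n";
}
```

Running the rename against a copy of a few files and eyeballing this kind of report should show quickly whether strings, comments or builtins are being touched by accident.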

If I made any mistakes, it would be good if you posted your final
scripts back here for posterity.

Greg

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

