On Fri, Dec 3, 2010 at 6:09 AM, Richard Quadling <rquadling@xxxxxxxxx> wrote:

> Hi.
>
> OK. I'm stuck. I just can't work this out.
>
> I like SPL. I like the re-usability. It seems right. But I just can't
> work it out.
>
> I want to read through a file system, looking for files of a particular
> type. I want to ignore and not traverse any folder with a specific name
> anywhere in the directory tree.
> I want to examine the contents of the file with a set of regular
> expressions and call a closure for each match for each regex which
> will replace and replace the file with the amended one.
> I want to get a list of the files altered along with which regex and
> the number of changes.
>

That could work with DirectoryIterator, a subclass thereof, or a
combination of DirectoryIterator and FilterIterator.

> The part I'm really stuck on is working out what elements of the SPL
> fit my issue the best and what bits I have to implement myself, with
> regard to the file system iteration and filtering.
>

You have a few options, as I mentioned above. It depends on whether you
want to encapsulate the logic you're describing or just use it in one
place. The simplest way to go is DirectoryIterator, driving the iteration
via foreach, with your custom logic inside the body of the foreach.

    foreach(new DirectoryIterator('.') as $oFile)
    {
        // custom, one-off logic here

        // ex. skip directories w/ a given name
        if($oFile->isDir() && $oFile->getFilename() == 'LeaveMeAlone')
            continue;
    }

The issue here is that the logic portion is not very re-usable. You could
easily move the logic into a subclass, though, if you want it to be
re-usable. Most likely it will end up in the current() function; however,
you may want to consider dividing up the logic between a DirectoryIterator
subclass and a FilterIterator implementation.
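To make the split between DirectoryIterator and FilterIterator a bit more concrete, here is a minimal sketch of the 'files of a certain type' check moved into a FilterIterator. The class name ExtensionFilterIterator and the helper listFilesByExtension() are made up for illustration and are not part of SPL; the type declarations assume PHP 7 or later. It only looks at a single directory, so treat it as a starting point rather than a full solution.

```php
<?php
// Hypothetical helper class, not part of SPL: accepts only regular files
// whose extension matches the one given to the constructor.
class ExtensionFilterIterator extends FilterIterator
{
    private $extension;

    public function __construct(Iterator $iterator, string $extension)
    {
        parent::__construct($iterator);
        $this->extension = $extension;
    }

    // FilterIterator calls accept() for every element of the inner iterator;
    // returning false hides that element from the foreach.
    public function accept(): bool
    {
        $oFile = $this->current();
        return $oFile->isFile()
            && pathinfo($oFile->getFilename(), PATHINFO_EXTENSION) === $this->extension;
    }
}

// Hypothetical convenience wrapper: sorted names of the matching files
// in one directory (no recursion).
function listFilesByExtension(string $dir, string $extension): array
{
    $names = array();
    foreach (new ExtensionFilterIterator(new DirectoryIterator($dir), $extension) as $oFile) {
        $names[] = $oFile->getFilename();
    }
    sort($names);
    return $names;
}

// ex. list the .txt files in the current directory
print_r(listFilesByExtension('.', 'txt'));
```

The filter knows nothing about where the files come from, so the same class could in principle be wrapped around a different iterator of file objects later on.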

FilterIterator lends itself to the 'should I operate on this node'
question, so when you're talking about looking for files of a certain type
and folders of a certain name, that sounds like it should go in a
FilterIterator. However, at a cursory glance, if you want to skip entire
sets of children based on the name of a given directory, you'll likely
need to iterate a RecursiveDirectoryIterator yourself or subclass
RecursiveIteratorIterator.

For the other part, w/ the closure, changing files and tracking which
files have been changed, that sounds best for a DirectoryIterator
subclass. And if you're planning on handling multiple directories, w/
nested children, go for a RecursiveDirectoryIterator.

> The tree is pretty large - just over 5,000 files with 5,700
> directories. The deepest file is 12 levels down.
>
>
> Now I know someone could supply a solution not using the SPL. I've got
> one of those. But trying to use the SPL just seems too awkward. Too
> many choices. Too many things with the same name
>
> Iterator, RecursiveIterator, RecursiveIteratorIterator.
>

Once you've messed with the classes for a bit, the choices become much
more natural and obvious. I too was overwhelmed by the sheer number of
classes when I first had a look.

> An Iterator suggests that it can work on a simple array ($argv for
> example). A non SPL variant would be foreach().
> A RecursiveIterator suggests it can iterate recursively on nested
> data (a file system for example). Using a function which uses
> scandir() and calls itself for a directory would look like a recursive
> iterator.
>
> So what does a RecursiveIteratorIterator do? Documentation says "Can
> be used to iterate through recursive iterators.".
>

The RecursiveIteratorIterator is best thought of as a tool to 'flatten'
out a hierarchical data structure. So imagine having 1 foreach block and
every node in the tree magically gets driven through it at some point.
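A small sketch of that 'flattening', using a nested array instead of a real file system so it runs anywhere. The $tree data and the walkManually() function are made up for illustration; RecursiveArrayIterator simply stands in for RecursiveDirectoryIterator here. The first walk drives hasChildren() / getChildren() by hand, the second lets RecursiveIteratorIterator do the descent inside a single foreach.

```php
<?php
// Made-up nested data standing in for a directory tree.
$tree = array(
    'a'   => array('more' => 1, 'blah' => 2),
    'b'   => array('care' => 3),
    'top' => 4,
);

// Manual walk: ask each node whether it has children and recurse yourself.
function walkManually(RecursiveIterator $it, array &$keys)
{
    foreach ($it as $key => $value) {
        $keys[] = $key;
        if ($it->hasChildren()) {
            walkManually($it->getChildren(), $keys);
        }
    }
}

$manual = array();
walkManually(new RecursiveArrayIterator($tree), $manual);

// Flattened walk: one foreach, RecursiveIteratorIterator handles the descent.
// SELF_FIRST yields a parent node before its children.
$flat = array();
$rii = new RecursiveIteratorIterator(
    new RecursiveArrayIterator($tree),
    RecursiveIteratorIterator::SELF_FIRST
);
foreach ($rii as $key => $value) {
    $flat[] = $key;
}

echo implode(' ', $flat) . PHP_EOL;   // a more blah b care top
var_dump($manual === $flat);          // bool(true)
```

Both walks visit the nodes in the same order; the second one just hides the recursion.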

Without RecursiveIteratorIterator you have to be bothered to make calls to
the hasChildren() / getChildren() methods of the RecursiveIterator, and
then iterate through the child Iterators yourself.
RecursiveIteratorIterator just packages that logic up in a standard way.

> But that sounds daft. The RecursiveIterator is handling the
> parent/child/tree navigation (isn't it - if not why not?)
>
>
> The name really doesn't help me to understand what's going on or what its
> for.
>

The docs are a little rough, though manageable once you've messed w/ the
classes a bit. I know what you mean.

> Is there any place that can describe the SPL in a different way.
>

Head to the homepage of the original SPL docs,

http://www.php.net/~helly/php/ext/spl/

then search for "Some articles about SPL"; these are quite useful when
getting started w/ the Iterators.

> I know it's probably just me, but it really seems like I'm only just
> scratching the surface. And not really getting anywhere.
>

I mean, I don't know anything but am glad to help where I can. Not
everyone cares for Iterators, but they're a great way to package re-usable
logic and manage OO collections. Of course I doubt PHP will ever have type
safe collections :O

Here's a quick example of some of the things I've mentioned in this post:

<?php
class MyRIIterator extends RecursiveIteratorIterator
{
    // never descend into a directory named 'dontTouch'
    public function callHasChildren()
    {
        if($this->current()->isDir() && $this->current()->getFilename() == 'dontTouch')
            return false;

        return parent::callHasChildren();
    }
}

class SkipPHPFilesFilterIterator extends FilterIterator
{
    // reject anything w/ a .php extension; compare the extension itself
    // so a name like 'notes.php.bak' is not skipped by mistake
    public function accept()
    {
        if(pathinfo($this->current()->getFilename(), PATHINFO_EXTENSION) == 'php')
            return false;

        return true;
    }
}

foreach(new SkipPHPFilesFilterIterator(
            new MyRIIterator(
                new RecursiveDirectoryIterator('.'),
                RecursiveIteratorIterator::SELF_FIRST)) as $oFile)
    echo $oFile->getFilename() . PHP_EOL;
?>

Here's a directory structure you can use to test w/; just put the php file
containing the above code in the test-iterator directory. You should see
the 'meh' file is omitted from the results, as it resides in the
'dontTouch' directory (the directory itself is still listed, only its
children are skipped). You should also see the .php file omitted by the
FilterIterator.

mkdir test-iterator
mkdir test-iterator/a
mkdir test-iterator/b
touch test-iterator/b/care
touch test-iterator/a/more
touch test-iterator/a/blah
mkdir test-iterator/dontTouch
touch test-iterator/dontTouch/meh

-nathan