Re: Tidy HTML source?


 



----- Original Message ----- From: "Paul Novitski" <paul@xxxxxxxxxxxxxxxxxxx>


At 11/28/2006 05:05 AM, Satyam wrote:
May I invite you to check http://satyam.com.ar/pht/? This is a project I started some time ago to help me produce HTML in a cleaner and more efficient way. Usually, producing good HTML involves running a sample output through some HTML validator or looking at the HTML and checking it by hand, which, of course, requires good formatting to make it understandable. In the case of highly dynamic HTML (meaning the output can vary widely) it is impossible to produce enough samples of all the possible outputs to get them checked.

So, my idea was to embed HTML into the language itself so that the final output could be checked at 'compile time' and even before, using the standard tools provided by the IDE (even just matching braces goes a long way toward catching mismatched HTML tags).


Satyam,

That's an impressive bit of work, congratulations.

It's interesting to see someone spend such energy merging PHP logic with HTML. I've gone in the opposite direction, separating the two as much as possible. My own CMS merges content with HTML based on CSS-style selectors so that the logic layer of my applications doesn't need to know -- or contain -- the full details of the markup. I find this a natural and agreeable extension of the move to separate HTML markup from CSS presentation and JavaScript behavior.


It's interesting to note that for all your effort to generate good, clean HTML, you're still able to generate a div nested inside a table:

<table for ( $i=1; $i < 10; $i++) {
    <div {
       &style = ($i & 1?"odd":"even");
        <tr {

Ouch!  According to the spec, this is an illegal structure:
http://www.w3.org/TR/html4/struct/tables.html#h-11.2.1

Your pre-compiler has ensured that all your tags are well-formed, but it doesn't ensure that you've followed the rules of correct markup. Perhaps a future iteration of your software will incorporate more HTML structural rules and will give you precompiler errors in such cases.


That would be something to be dealt with in the second part of the project, that of validating the HTML output against the DTD. Actually, the pre-compiler is meant for generic XML, so it does not know about divs and tables; it only knows how to output properly formatted XML.
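
The difference between well-formed and valid output can be sketched with PHP's DOM extension. This is only an illustration, not part of PHT: the miniature DTD and the helper name isValidAgainstDtd() are assumptions made up for the example, and the DTD is far simpler than the real HTML 4 one.

```php
<?php
// Hypothetical miniature DTD for illustration only; it is NOT the HTML 4 DTD.
// It says: a table holds one or more tr, a tr holds one or more td.
const TABLE_DTD = <<<'DTD'
<!DOCTYPE table [
  <!ELEMENT table (tr+)>
  <!ELEMENT tr (td+)>
  <!ELEMENT td (#PCDATA)>
  <!ELEMENT div ANY>
]>
DTD;

// Hypothetical helper: true only if $xml is well-formed AND valid against
// the DTD it carries in its own DOCTYPE.
function isValidAgainstDtd(string $xml): bool
{
    $previous = libxml_use_internal_errors(true); // collect errors quietly
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml) && $doc->validate();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

// Both documents are well-formed; only the first follows the content model.
$good = TABLE_DTD . "\n<table><tr><td>1</td></tr></table>";
$bad  = TABLE_DTD . "\n<table><div><tr><td>1</td></tr></div></table>";

var_dump(isValidAgainstDtd($good));
var_dump(isValidAgainstDtd($bad));
```

Both strings parse without trouble, since every tag is closed, but the second one should fail validation because the sketch DTD does not allow a div directly inside a table. That is the kind of error brace matching alone cannot catch.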


You write:

So, we have two well structured languages, one is procedural (any flavor of C, JavaScript, PHP, Perl, etc.), the other descriptive. Their blocks are quite compatible: to start with, they nest nicely within each other. If an XML block is contained within an if() block, it has to be completely within it, the boundaries of their blocks should not overlap.

Help me understand the relevance of this statement. A very common pattern in a mixed logic/HTML script goes like this:

        echo '<ul>';

        // $array is assumed to hold the list items
        foreach ($array as $item)
        {
                echo '<li>' . $item . '</li>';
        }

        echo '</ul>';

In these cases the boundaries of the HTML block do in fact overlap the boundaries of the foreach() logic block.


Perhaps 'overlap' is not the right word; I mean that one is fully contained within the other. The braces of the foreach are fully enclosed by the ul, and the li tag is fully within the braces.
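
The containment described here can be sketched in plain PHP, without the pre-compiler. The tag() helper below is hypothetical and is not how PHT works internally; it only shows that when each tag wraps a block of code, the closing tag is emitted where the block ends, so the ul encloses the foreach and each li sits inside the loop body.

```php
<?php
// Hypothetical helper, not part of PHT: wraps whatever $body returns
// in an opening and closing tag, so the tag cannot be left unclosed.
function tag(string $name, callable $body): string
{
    return "<$name>" . $body() . "</$name>";
}

$items = ['apples', 'pears'];

// The foreach is fully inside the ul block; each li is fully inside the loop.
$html = tag('ul', function () use ($items): string {
    $out = '';
    foreach ($items as $item) {
        $out .= tag('li', function () use ($item): string {
            return htmlspecialchars($item);
        });
    }
    return $out;
});

echo $html, "\n"; // prints <ul><li>apples</li><li>pears</li></ul>
```

A block boundary that crossed a tag boundary simply cannot be written this way, which is the property the nesting argument relies on.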

And, of course, I would appreciate any comment on the project, EXCEPT that you use template engines and that you do not generate HTML directly. I've heard that and it is completely missing the point so, please, spare me that one. At one point or another plain HTML has to be generated.

Unless I'm missing the boat, it seems to me that the primary advantage of your precompiler is that it enables you to close HTML tags simply by closing braces, a convention policed by your pre-compiler and the PHP interpreter itself, so that you'll get pre-compiler or interpreter errors for incorrect closure instead of waiting for the W3C validator to check your work. I don't mean to minimize the significance of your accomplishment, but personally I don't find generating accurate markup to be a great problem. I'm a careful hand-coder, and true to the topic of this thread I find that neatly-indented HTML helps me validate my own markup. Online validators help me catch any errors I miss.

Without the second part, validating against the DTD, yes, it doesn't do much more than you say.


What I find to be a much greater problem is the human readability of logic code when HTML is mixed throughout. Your innovation is helpful here, as you're nearly making HTML tags into PHP key words, eliminating some of the literal quoting that makes PHP+HTML so tiresome. However, even with your pre-compiler the messy quotes are still there on the attribute level.

The value of the attribute is any valid PHP expression and if they are literal strings there will be quotes, but then, you can also use heredoc if, for example, you are putting some JavaScript code into an event. If the value is a numeric value, there is no need for any quotes at all. In the end, the value of an attribute is any valid PHP expression and it follows PHP rules.
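
In ordinary PHP terms, and only as a rough sketch rather than PHT syntax, the idea that an attribute value is just an expression might look like this. The variable names are made up for the example.

```php
<?php
$i = 3;

// A plain expression as an attribute value, as in the odd/even example.
$class = ($i & 1) ? "odd" : "even";

// A heredoc avoids nested quoting when the value is a piece of JavaScript.
$onclick = <<<JS
alert("row $i");
JS;

// Escape the value before placing it inside a double-quoted attribute.
$row = sprintf(
    '<tr class="%s" onclick="%s">',
    $class,
    htmlspecialchars($onclick, ENT_QUOTES)
);

echo $row, "\n";
```

The quotes around literal strings remain, but they follow PHP's normal rules and the numeric value $i needs none.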

And, stepping back, you're perpetuating the embedding of markup with logic so that it will still take a PHP programmer to modify the markup of one of your pages. Do you not see the advantage in separating the two layers?


Yes, I do, and I would recommend using templates or similar tools to provide for separation of code and markup, but sometimes there are reasons not to do so, for example, web services. Though the examples are in HTML so I don't have to explain the semantics of an arbitrary markup language, it is actually meant for XML, where there is no presentation layer at all. Lots of packages (WordPress, many picture galleries and CMSs) don't do much separation of layers at the HTML/PHP interface. They do have separate modules for presentation, program logic and data handling, but the presentation layer is not plain HTML with template tags but more of a mix of PHP and HTML, so this would still work for those.

Again, in spite of this criticism I'm impressed with your effort. Good work!

I appreciate your comments and, I admit, my main purpose in doing this was to learn how to do it. I am an engineer and when I studied, a couple of semesters of Fortran IV was all I got (and punching cards at that, yes, I am that old), all the rest was self-taught so I wanted to go deeper into some aspects of computer science such as compilers (there is also a PHP grammar for JavaCC which I made earlier in the process).

In fact, my original idea was some sort of embedded SQL as it exists for C, but I know it does not work all that well; it has been around for quite some time and has never caught on. SQL is such a different kind of beast that it is hard to make it compatible. SQL cursors and error handling are concepts which are hard to blend into a procedural language, so I believe it is better to handle SQL through functions, where it is clearly separate from the language calling them. Thus, I thought, we have three main languages here: HTML, PHP and SQL. I know PHP and SQL don't mix well, so how about the other end? That's when I started to think about this pre-compiler and found it to be a pretty logical mix.

Cheers and thanks again for your comments.

Satyam



Regards,
Paul

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

