Validating a Custom DTD – A List Apart

In his article in this issue, Peter-Paul Koch proposes
adding custom attributes to form elements to allow triggers for specialized
behaviors. The W3C validator won’t validate a document with these
attributes, as they aren’t part of the XHTML specification.

This article will show you how to create a custom DTD that adds these
custom attributes, and how to validate documents that use the
new attributes. Here is a sample of the HTML with the custom attributes that
let us specify the maximum length of a text area and whether a form element
is required or not:

<form>
<p>
  Name:
  <input type="text" name="yourName" size="40" />
</p>
<p>
  Email:
  <input type="text" name="e-mail" size="40"
  required="true" />
</p>
<p>
  Comments:
  <textarea maxlength="300" required="false"
  rows="7" cols="50"></textarea>
</p>
<p>
  <input type="submit" value="Send Data" />
</p>
</form>

A Document Type Definition (DTD) is a file that
specifies which elements and attributes exist in a markup language and
where they can appear. Thus, the XHTML DTD specifies that
<p> is a valid element, and that it can appear
inside a <div>, but not inside a <b>.
The URL at the end of your DOCTYPE declaration points
to a place where you can find the DTD for the flavor of HTML you’re
using. Neither your browser nor the W3C Validator goes out to the web to find
the DTD; they have a “wired-in” list of the valid
DOCTYPEs and use the URL for identification purposes only. As you will see
later, this will change when you make a custom DTD.

Specifying the attributes

Adding attributes to an existing DTD is easy. For each attribute, you
need to specify which element it goes with, what the attribute name is,
what type of values it may have, and whether the attribute is optional or
not.  This information is specified in this model:

<!ATTLIST
  elementName attributeName type optionalStatus
>

To add the maxlength attribute to the
<textarea> element, you write this:

<!ATTLIST textarea maxlength CDATA #IMPLIED>

The CDATA specification means that the attribute value
can contain any old character data you please; thus
maxlength=“300” or maxlength=“ten” will both
be valid. For “open-ended” data, DTDs don’t let you
get more specific.  The #IMPLIED specification means that
the attribute is optional.  A required attribute would specify
#REQUIRED.

When you have a list of possible values for an attribute, you may specify
them in the DTD.  This is the case with the attribute named
required,
which has the values true and false. The values
are case sensitive; in this example only the lowercase values are specified, so
a value of TRUE would not be considered valid.

<!ATTLIST textarea required (true|false) #IMPLIED>

Confusion alert! This attribute is named “required,”
but you don’t have to put it on every <textarea>
element, so it’s an optional attribute.
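To watch these rules at work, here is a minimal sketch that feeds a few textarea elements to a validating parser. It assumes the JAXP parser bundled with Java, and it uses a tiny stand-in DTD written as an internal subset rather than the real XHTML DTD; the class name EnumeratedAttributeDemo and the two-element toy DTD are illustrative assumptions, not part of this article’s sample files.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class EnumeratedAttributeDemo {

    // Toy DTD (an assumption for this sketch): it stands in for the XHTML DTD
    // and declares only what the example needs.
    private static final String DOCTYPE =
          "<!DOCTYPE form [\n"
        + "  <!ELEMENT form (textarea)*>\n"
        + "  <!ELEMENT textarea (#PCDATA)>\n"
        + "  <!ATTLIST textarea maxlength CDATA #IMPLIED>\n"
        + "  <!ATTLIST textarea required (true|false) #IMPLIED>\n"
        + "]>\n";

    /** Validates the given markup inside <form> and returns the validity errors. */
    public static List<String> validate(String body) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // turn on DTD validation
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        String document = DOCTYPE + "<form>" + body + "</form>";
        builder.parse(new InputSource(new StringReader(document)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        String[] samples = {
            "<textarea required=\"true\"></textarea>",  // value from the declared list
            "<textarea required=\"TRUE\"></textarea>",  // wrong case
            "<textarea maxlength=\"ten\"></textarea>",  // CDATA accepts any text
            "<textarea></textarea>"                     // #IMPLIED: attributes may be omitted
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + validate(sample));
        }
    }
}
```

An empty list means the sample passed validation; any other result carries the parser’s own error text.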

The attribute named required should also be available to the
<input> and <select> elements. All
in all, the specifications to modify the DTD look like this:

<!ATTLIST textarea maxlength CDATA #IMPLIED>
<!ATTLIST textarea required (true|false) #IMPLIED>
<!ATTLIST input required (true|false) #IMPLIED>
<!ATTLIST select required (true|false) #IMPLIED>

Note: Adding new attributes to existing
elements is easy; adding new elements is somewhat more difficult and beyond
the scope of this article.

Inserting the attributes

Now that you’ve defined the custom attributes, how do you put
them where a validator can find them?  The simplest place to put them
would be as the
internal subset
directly in your document:

<!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
[
  <!ATTLIST textarea maxlength CDATA #IMPLIED>
  <!ATTLIST textarea required (true|false) #IMPLIED>
  <!ATTLIST input required (true|false) #IMPLIED>
  <!ATTLIST select required (true|false) #IMPLIED>
]>

If you run such a file through the W3C
validator, you find that it validates wonderfully well.
If you download the sample files for this article and validate
file internal.html, you can see this for yourself.
Unfortunately,
when you display the file in a browser, the ]>
shows up on the screen.  There’s no way around this bug, so this
approach is right out.

An approach that does work requires you to obtain the
XHTML transitional DTD and add your modifications to that file.
The original version of the DTD is file
xhtml1-transitional.dtd in directory dtd
from this article’s sample files.  You will also find
three files with the .ent extension in that
directory. These three files
define all the entities that you use in HTML,
such as &nbsp; and &ntilde;. You
need to keep all these files together in the same directory.

The customized file, named xhtml1-custom.dtd, was
created by opening file xhtml1-transitional.dtd and
adding the new attribute specifications at the end of the file. When
adding attributes, you
want to add your customizations at the end of the DTD to
ensure that everything they need to reference
has already been defined.
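If you would rather script that step than edit the file by hand, the following is one possible sketch. It copies the original DTD to a new file and appends the four declarations to the copy; the class name MakeCustomDtd is an assumption for illustration, and the paths in main simply follow the dtd directory layout described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;
import java.util.List;

public class MakeCustomDtd {

    // The declarations to append. The leading empty string makes sure they
    // start on a fresh line even if the original file lacks a final newline.
    static final List<String> CUSTOM_ATTRIBUTES = Arrays.asList(
        "",
        "<!-- custom attributes -->",
        "<!ATTLIST textarea maxlength CDATA #IMPLIED>",
        "<!ATTLIST textarea required (true|false) #IMPLIED>",
        "<!ATTLIST input required (true|false) #IMPLIED>",
        "<!ATTLIST select required (true|false) #IMPLIED>"
    );

    /** Copies the original DTD, then appends the custom declarations to the copy. */
    public static void customize(Path original, Path custom) throws IOException {
        Files.copy(original, custom, StandardCopyOption.REPLACE_EXISTING);
        Files.write(custom, CUSTOM_ATTRIBUTES, StandardCharsets.UTF_8,
                    StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: both files live in the dtd directory, next to the .ent files.
        customize(Paths.get("dtd", "xhtml1-transitional.dtd"),
                  Paths.get("dtd", "xhtml1-custom.dtd"));
    }
}
```

The original file is left untouched, so you can always regenerate the custom DTD from a clean copy.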

Changing the DOCTYPE

You must now change the <!DOCTYPE> in your HTML
file to indicate that you are now using this custom “flavor”
of XHTML.
Since the custom DTD is not one of the publicly registered ones,
the DOCTYPE will not use the PUBLIC specifier. Instead,
you use the keyword SYSTEM followed by the location of the
custom DTD. This may be a relative or absolute path name, or, if your
DTD is on a server, a URL.  The path must point to where your
custom DTD really is!
File custom.html in the sample files for this article
uses a relative path name:

<!DOCTYPE html SYSTEM
   "dtd/xhtml1-custom.dtd">

When you try to use the W3C validator on
custom.html, it rejects
the document because you aren’t using one of the validator’s
approved DTDs.

Using a different validator

The solution is to use a different validator which will actually go
out to the URL that you have specified and use it to check whether your
document is valid or not.
Because the document you are validating is XHTML,
you can use any XML parser that
does validation. This article uses the Xerces parser,
available from
xml.apache.org.  This parser is written in
Java™, so you will need to have Java installed on your system.
When you unzip the Xerces download file, it will create a directory named
xerces-2_6_2 (or whatever version is current).  In the
following text, the assumption is that you have unzipped it to the top
level of the
C: drive on Windows or to /usr/local on Linux.

One of the sample files that comes
with Xerces is the Counter program. This program
counts the number of elements,
attributes, ignorable whitespaces, and characters appearing in
an XML (or, in this case, XHTML) document. This program has an option
to turn on validation as it parses the document, making it perfect for
the task at hand.
You run the Counter program (which is going to be your
“validator”) from
a batch file for Windows or a shell script for Linux.
Here is the
batch file, named
validate.bat.
It’s all on one line, but shown here split across lines to
fit on the page. Please note: there is a blank before the word
dom and after the -v.

java -cp c:\xerces-2_6_2\xercesImpl.jar; »
c:\xerces-2_6_2\xmlParserAPIs.jar; »
c:\xerces-2_6_2\xercesSamples.jar dom/Counter -v »
%1 %2 %3 %4 %5 %6 %7 %8

Here is the Linux shell script, named validate.sh.

java -cp /usr/local/xerces-2_6_2/xercesImpl.jar:\
/usr/local/xerces-2_6_2/xmlParserAPIs.jar:\
/usr/local/xerces-2_6_2/xercesSamples.jar \
dom/Counter -v $1 $2 $3 $4 $5 $6 $7 $8

Of course, if you have unzipped Xerces to a different location, you
must change the path names.
Once this is all set up, you can validate the file
custom.html by typing
this on a Windows command line:

validate custom.html

Or this at a Linux shell prompt:

./validate.sh custom.html

If your file is valid, you’ll receive a message giving the
filename and some statistics about the file, like this:

custom.html: 543;50;0 ms
  (15 elems, 20 attrs, 9 spaces, 43 chars)

If the file is not valid, you’ll get error messages as well.
For example, if you try to validate a file named badfile.html
which contains these errors:

<p>Email: <input type="text" name="e-mail" size="40"
 required="sure" /></p>
<p>Comments:
<textarea maxlength="300" inquirer="false" rows="7" cols="50"></textarea>

You’ll get this output from the validator:

[Error] badfile.html:12:70: Attribute "required"
  with value "sure" must have a value from the
  list "true false ".
[Error] badfile.html:14:63: Attribute "inquirer"
  must be declared for element type "textarea"
badfile.html:
  611;82;0 ms (15 elems, 20 attrs, 9 spaces, 43 chars)
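If you’d like to see roughly what a validating parser is doing here, the sketch below performs a similar check with the JAXP API that ships with Java. It is not the Counter sample and prints no element counts; the class name DtdValidate and the shape of its output lines are assumptions that merely imitate the listing above. A relative SYSTEM path in the DOCTYPE is resolved against the location of the document being parsed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

public class DtdValidate {

    /** Parses the file with DTD validation on; returns one line per validity error. */
    public static List<String> validate(final File file) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // read the DTD named in the DOCTYPE and check against it
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) {
                errors.add("[Error] " + file.getName() + ":" + e.getLineNumber()
                           + ":" + e.getColumnNumber() + ": " + e.getMessage());
            }
            // A document that is not well-formed cannot be validated at all.
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(file);
        return errors;
    }

    public static void main(String[] args) throws Exception {
        for (String name : args) {
            List<String> errors = validate(new File(name));
            for (String line : errors) {
                System.out.println(line);
            }
            System.out.println(name + ": "
                + (errors.isEmpty() ? "valid" : errors.size() + " validity error(s)"));
        }
    }
}
```

After compiling with javac DtdValidate.java, you would run it as java DtdValidate custom.html from the directory that holds custom.html and the dtd directory.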

Another validation method

If you are using the
jEdit editor,
you may download the XML plugin. If you name your file with the
extension .xhtml, jEdit will validate using your custom
DTD as specified in the DOCTYPE.

It’s easy to specify extra attributes for XHTML elements; with a little
bit of work, you can set up a validator to check your files against your
custom version of HTML.  Download all the
sample files from this article and give it a whirl.
