$Id: standards.html,v 1.16 2005-10-18 21:14:50 staylor Exp $
The goals of these standards are to
The hard part about standards is getting the right amount of them. Not enough and we won't meet our goals. Too many and we will spend all our time looking up standards, and arguing about them.
ZFIN strives to have just enough standards to meet our goals. We strive to make them easy to follow, which means we put a fair amount of time into making this document as readable as possible. We attempt to express standards (and non-standards in concise statements that are are shown in bold and in a distinct green color.
The emphasis of this document is on naming conventions, modularity and well defined interfaces, and comments.
These standards do not emphasize on the syntax or particular statements These standards generally do not specify the one true way to do something in the code.
In addition to describing coding standards that ZFIN has, this document also includes some discussion of standards that we have agreed not to have. In particular, if we spent time discussing a possible standard and decided that either the issue was too contentious or that it didn't matter then this document will explicitly state that we have no standard in that area.
This is a living document. New standards are needed as we gain more experience or adopt new technology. Some existing standards will need to be tweaked or discarded.
New and revised standards start life in this file as preliminary standards. Preliminary standards are subject to a one month feedback and revision process. At the end of that month, the preliminary standard gets promoted to a standard.
Preliminary standards are displayed with a special background color and are preceded with this comment:
This section is a preliminary (or revised) standard posted on YYYY/MM/DD. It is open to comments and revision until YYYY/MM+1/DD, when it will lose its preliminary status.
New and revised standards are added to this file
Once these standards have settled, I propose a quarterly review of them to figure out what needs to be changed, and then change it. Discussion can be tracked in FogBUGZ.
Enforcement of standards, like standards themselves, is a matter of striking a balance. We want just barely enough enforcement to ensure that standards are followed.
ZFIN considered two extremes of enforcement:
We settled on an approach that is one step up from self-enforcement.
ZFIN adopted these standards for code reviews:
When sending changes to other developers for review it is useful to include the following information.
For the set of files being changed:
For each file being changed:
For example,
Whenver possible, you should allocate sufficient time in your schedule for code reviews and any resulting rework. This is often not hard at ZFIN since many projects do not have hard deadlines. However, many projects do have hard deadlines that cannot be missed due to interaction with other organizations or because of upcoming events such as conferences or publications.
In such cases, development work can go right up to the dealine, not leaving enough time to do a code review before the code needs to go in. What should we do in these cases?
First, try to avoid this situation whenever possible. Really.
If a deadline looms and the changes need to go in, then commit them without code review, stating in the CVS message that the changes were not code reviewed due to deadlines.
Get the changes code reviewed within a week of committing them.
This section discusses coding standards and best practices that are independent of the particular language or technology being used. Often the standards given in this section are expanded on in some or all of the language specific sections. When working on code, both the general and the language specific standards must be used.
There are three main areas of emphasis in this document, and several minor areas. The three main areas have both general standards and language specific standards. The three major areas are:
ZFIN cares a great deal about naming conventions for almost everything you can think of:
Each language has its own set of naming conventions. See:
In languages where global variables are the only option (Web Datablade), or where the default scope for variables is global (PERL), then our naming conventions include standards that help us tell where any variable came from.
See also:
ZFIN code needs to be modular and it needs to have well defined interfaces. Our goals in this area include:
A well defined interface will specify these things:
How modularity is implemented varies from language to language. See:
Comments are a key part of the ZFIN standards. In addition to just making things more readable, they are key to well defined interfaces.
See also:
Comments are required at the top of each defined interface. This specifies what the file/module/program/routine/etc does, expects as input, produces as output, and what side effects it has. Each language has a different format for defining interfaces. See:
Indenting makes a file infinitely more readable. However, good indenting complements the practice of putting comments on closing tags/statements that are a long way from their matching opening tag or statement. Comments are required on some closing tags / statements that are more than 50 lines away from the matching opening tag or statement. The practice of using ending comments helps enormously on blocks of code that span a page or more.
See also:
Comments are recommended anywhere where it is not obvious from the code what is going on. If it took you some time to figure out what is going on then you should take the time to record your hard-earned knowledge in a comment.
Inline comments are love notes to your future self. Feel the love.
See also:
There are several types of inline comments that deserve particular attention and that are to be labeled with specific "flags" in the comment. This will allow us to search for this particular type of flag. The flags are
| :TODO: | Put this in the code when you have more to do here and don't ant to forget to do it. You should never commit a file that a :TODO: comment flag in it. |
| :KLUDGE: | Use when something isn't pretty, but it works. Flags things that could be improved on the next thorough examination of the code. |
| :PRODUCT_BUG: | Use this when code is working around a bug in a product. When the next release of the product comes out this will allow us to test if the problem has gone away. |
Comments are also useful as section separators in large files. Section separator comments are large enough to pop out at you when scrolling through a file in an editor or looking at a printed page. Each language has its own standard section separator comment format. See:
Our standard on indenting is that you must indent your code. Code that is not indented or is indented in a misleading way will not be accepted. However, we usually don't have specific standards telling you exactly how you should indent your code.
See also:
Do you have to reindent all the code whenever you add an extra condition, for example, around a large part of a file? No, you only need to reindent a file when it becomes hard to read. In order to avoid the need for constantly reindenting a file, we recommend that your initial indent be at least 4 characters wide. This gives you room to add at least one condition (and up to 3 if you squint) before you need to reindent a large portion of the file.
Also, large scale indenting changes to a file should be done as their own revision. That is, whenever possible, we should not mix logic changes and big indenting changes. They should be done as two separate revisions.
Configure your editor to use spaces instead of tab characters. There is no standard definition across tools on what a tab means, even on Unix.
Once we figure out how to do this in each editor in use at ZFIN, add a description here on how to do this.
Good indenting practice and good naming conventions are at odds with maximum line length standards. Good indenting and readable variable names leads to long lines. There are also some situations, particularly in HTML and JavaScript, where it is impossible to have short lines.
ZFIN does not have a maximum line length. However, short lines are favored over long lines whenever using short lines does not compromise readability. If you can easily fit things in 80 characters or less, please do so.
Any file that is part of the web site, that is used to produce the web site, or that is used to update the web site must be under CVS.
This includes things like FileMaker files, and load scripts. Historical load scripts that are not likely to be used again do not need to be loaded into CVS.
Files cannot be committed to CVS until the changes in them have been inspected by someone else. See Enforcement for details.
When committing a file 3 things must be provided.
Multi-tier applications like dynamic web sites are notoriously difficult to debug. Does the problem lie in the browser, the web server, HTML code, file permissions, app page tags, SQL syntax, database permissions, or in the database server? Often it is darn hard to tell.
Wherever possible, a standard approach should be taken to debugging in a given language. This means that
See also:
We have standards because they improve the quality of the code in the vast majority of cases in which they can be applied. If this weren't true of a standard then that standard should quickly become a non-standard.
However, even the best standards aren't applicable 100% of the time. If you find yourself in a situation where applying a standard lowers the quality of the code then you are free to ignore the standard as long as:
Newly written code should always conform to the ZFIN standards. But what about old code that was written before we had standards?
Here are some guidelines
This section describes standards and best practices for both HTML only pages (static pages) and for HTML code that is embedded in other languages, such as app pages. There may also be additional language specific standards for HTML that is embedded in other languages.
Style settings (font, font size, colors, etc.) should be moved from HTML files to the single global ZFIN Cascading Style Sheet. All HTML should use styles from that global style sheet. There will be no local style settings, either in other CSS files, or directly in the HTML file.
HTML tags should be all lower case. We could have picked all upper case as well. We picked all lower case because that is what XHTML uses.
Not all opening HTML tags require matching closing tags. Here are ZFIN's guidelines for when we should provide closing tags that HTML does not strictly require.
HTML attribute values should always be placed in quotes. The HTML standard requires this.
The following should be italicized whenever they are displayed:
Note that genes are the only types of markers that are italicized. All other marker types are shown as plain text.
The standard ASCII character set is 7 bits long. It includes every character that is needed in most of our pages. However, it does not include non-English characters such as "ü" or "à", or non-alphanumeric characters such as the copyright sign, "©".
Including non-ASCII characters directly in HTML pages can be done, but it is problematic. There is no guarantee that editors, and more importantly browsers, will render the characters correctly. A better approach is to replace all non-ASCII characters with HTML Character Entities.
All non-ASCII characters in HTML should be converted to HTML Character Entities before being checked into CVS.
HTML Character Entities serve (at least) two purposes in HTML:
HTML Character Entities have the format:
&mnemonic;mnemonic is a 2 to 8 character mnemonic code for the character. Complete lists of HTML Character Entities are available online at:
| The official specification: | http://www.w3.org/TR/REC-html40/sgml/entities.html |
| Nicer interface: | http://www.alanwood.net/demos/ent4_frame.html |
Some of the more popular HTML character entities are:
| Character | HTML Character Entity |
|---|---|
| < | < |
| > | > |
| & | & |
| ü | ü |
As stated earlier, comments are a developer's best friend. However, for static HTML pages comments should be used sparingly:
For dynamic pages, always use the comment syntax for the dynamic language, rather than HTML comments. These comments do not go out over the web.
Links within the ZFIN web site should be relative rather than absolute.
There are several reasons for this:
What is an absolute link?
Any link in an "a href" html tag that has "zfin.org" in it.
How do we make a non-absolute (relative) link?
Well, here are some examples: To link to static pages:
To link to a specific web datablade, dynamically generated page, use one of our generic tags (all generic tags are found in the .tt file in [yer] database's /Commons/env directory). An example:
If you think of "/ZFIN_WWW" , "/" , and "yerdb.zfin.org" as synonymous, finding the relative link to your file under your ZFIN_WWW directory will be easier.
A relative link points to a aliased spot in ZFIN's file system. An absolute link also points to an aliased spot. However, an absolute link specifies which copy of the ZFIN file system it is pointing to (namely, the zfin.org copy), while a relative link allows some other process (namely, the apache configuration file and the gmake/.tt files) to direct it to the appropriate copy.
Kevin will write this section.
This section describes database standards relating to using the database. For logical database design standards, see the Logical Design Conventions section of the Database Design document. For physical design standards see the Physical Design section of the Administration document.
This section describes some less than obvious best practices for writing SQL at ZFIN. For standards that are specific to SPL routines see the SPL section.
Informix supports two types of comments in SQL:
Multiline comments (braces) are a good way to comment out large chunks of SQL when debugging. They could also be used to comment out defined interface comments in SPL routines.
However, braces have one drawback: SQL mode in emacs does not recognize braces as SQL comments and does not display comments in braces as comments. Emacs displays them as SQL code, which means that highlighting and colors in multiline comments are displayed more or less randomly. Because of this, ZFIN recommends that multiline SQL comments use dashes on each line, rather than braces to enclose the whole comment.
Each SQL keyword should start a new line. This SQL statement needs to be rewritten:
Here is one possibility:
Table and column names should be in all lower case. ZFIN has no standard for capitalizing SQL keywords.
ZFIN does not have a standard join syntax. You can use either the ANSI join syntax where join conditions are specified in the FROM clause or the older syntax where the join conditions are in the WHERE clause.
Examples
Putting a distinct clause in a select statement causes the select to get rid of any duplicate rows before returning the results. It does this by performing a sort on the result set, a fairly expensive operation.
Unless you have a reason to use distinct clause in a select statement, don't use one. Use a distinct clause only if there is the potential for duplicates in the result set. This requires some additional work on the programmer. You need to examine the columns and tables in the query to determine if duplicates are possible.
When inserting a list of values into a table, always enumerate the columns in the insert statement. This makes it easier to keep insert code up to date with table changes.
Example
Whenever possible, use column names rather than column numbers in order by and group by clauses. If someone adds or drops columns from a select list then an order by or group by clause using column names will continue to work as expected. Dropping or adding columns in the select list in a query that uses column numbers may result in a query that continues to run, just not the way it was supposed to.
For example, don't do this:
The above query would quietly break if I added mrkr_zdb_id to the front of the select list. Instead, use the column name:
Subqueries can greatly improve the readability of certain queries, and subqueries are the only way to ask some questions. Use subqueries wherever they are appropriate.
When writing subqueries use "exists/not exists" instead of "in/not in", whenever possible. There are two reasons for this:
It makes it easier for the optimizer. "In/not in" queries can take an order of magnitude longer to run in certain cases. (See below for why.) In general, when writing complicated queries, try and make it easier for the optimizer to figure out what is really being asked
"not in" subqueries are not logically equivalent to "not exists" subqueries when the list of values returned by the subquery includes NULLs. If any value in the list returned is a NULL then the whole subquery condition will evaluate to false. (See below for why.) I can't think of a situation when this is the behavior you would want. Avoid this by always using not exists subqueries instead.
For example, this query:
should be rewritten as this query:
Q: Why can the "in" query be an order of magnitude slower than the "exists" query?
A: If we could run the "in" query completely unoptimized, it would do a linear scan of 1 to 3 of the inner tables (expression_pattern_image, expression_pattern_stage, expression_pattern_anatomy) for each matching row in the outer table (expression_pattern). This is in fact what happens in many cases.
The "exists" query more clearly states in the subqueries that we are looking for the existence of rows with a specific value. In the "exists" query, it is much easier for the optimizer to figure out that it can use the indexes on the inner tables, rather than linear scans.
Q: Why do NULLs in "not in" subqueries always cause the condition to be evaluated to false?
A: In SQL any condition involving a NULL value is assigned the value of "unknown", rather than true or false. It maintains that unknown state until the last possible part of the query when the unknown is promoted to false. In the case of "not in" queries, NULLs mean that the database can't tell if the value is in the list or not, because the values of the NULLs are unknown. That unknown will eventually become a false.
This is one of two ways to create temporary tables. However, there are several things you need to be aware of when using it. See the Temporary Tables section for more information.
This section discusses standards that are specific to stored procedure language (SPL) routines. SPL routines at ZFIN range from simple ones that are called by insert and update triggers to do data validation to very long ones used to generate fast search tables. Most of the SPL routines at ZFIN are defined in the lib/DB_functions/SPL directory. SPL routines that are specific to data loads/unloads are also found in the server_apps/data_transfer/ tree.
Variables in SPL routines should use mixed case to distinguish words in variable names. SPL supports underscores in variables but we avoid their use at ZFIN. This allows the reader to clearly distinguish between variables and column names.
There is also a more subtle reason for this standard: SPL allows you to define a variable with exactly the same name as a database column. SPL does not issue any warnings when you do this. This can lead to unexpected results because it is not clear in SQL statements if the column or variable is being used. Using mixed case instead of underscores in SPL variable names prevents this from ever happening.
Variables in SPL routines that hold data from database columns should use "like table.column_name" in the define, rather than explicitly stating the datatype. For example, this
is better than
Using "like" makes the code much easier to maintain when the definitions of columns change.
Every SPL routine should have a defined interface. The defined interface for an SPL routine occurs as comments at the top of the routine, just after the routine definition:
See the SQL Comments and SPL Defined Interfaces sections for standards on comments in SPL routines.
Unless you go through the effort to support it, debugging SPL is darn hard. Fortunately, ZFIN has already gone through the effort to support debugging SPL. The state of the art for debugging SPL and comments describing how it works can be found in regen_names.
All complicated SPL routines should include the debugging features described in regen_names.
Any script that inserts, deletes, or updates data in the database should be thoroughly tested before running in production. This includes custom test queries that are specific to that script, as well as running two standard testing scripts afterwards:
Run validatedata.pl with the daily, weekly, and orphan options. See the makefile for details. This will detect if the script has broken any of ZFIN's regular validation checks. Check the results to see if any reported item could come from the script you just ran.
Run scrubscan.pl to scan character columns for illigal characters. See the makefile for how to do this. Check the results to see if any reported item could have come from the script you just ran.
Temporary tables are widely used at ZFIN. This section describes general best practices for creating and using temp tables.
See the Temporary Tables in App Pages section for additional information on that topic.
There are two ways to create temp tables:
The datatypes of the columns in the temp table are the same as the datatypes of the columns and expressions in the select statement. This is generally what we want, but it can get us into trouble when constants or expressions are in the select list, particularly with character data when the constant is a placeholder for values that will be provided later, such as ZDB IDs.
ZFIN has a general policy that character data less than 255 characters long is stored in VARCHAR columns rather than CHAR columns. In addition, ZFIN has a function, scrub_char(), and a cron job, scrubscan.pl, that we use to trim/report leading and trailing blanks in all our VARCHAR columns.
SELECT INTO TEMP statements are the leading cause of trailing blanks in VARCHAR columns at ZFIN, particularly on ZDB ID columns. Here's why, and how to avoid them.
The statement
creates a 2 column table. The first column is destined to hold a ZDB ID. However, it is created as CHAR(50), which means that when we populate it with ZDB IDs they will have trailing spaces. If we forget to do a trim() or a scrub_char() (and we will) then the ZDB IDs will show up in the database with trailing spaces, will be reported by the nightly cron job, and will need to be manually fixed.
This statement creates the column as VARCHAR(50):
The ZDB IDs will not have trailing spaces when set in this temp table, and therefore will not have trailing spaces when inserted into permanent tables.
NOTE: There are several other possibilities that are simpler, but that do not work. In particular, these 2 approaches fail:
Why? In both cases Informix creates the column as VARCHAR(1). Why? Nobody knows.
Temp tables should always be created using a "with no log" clause at the end of the table definition. This suppresses logging for the temporary table.
Also note that temp tables can have indexes and constraints as well.
Web Datablade app pages, also known as just "the Web Datablade" or "app pages", were the initial motivation for drawing up these standards. The majority of ZFIN's dynamic pages have used the Web Datablade since ZFIN was started. The Web Datablade is an IBM Informix product for tying together the web and Informix databases. It uses an extra layer of markup language/tags embedded in the HTML. The Web Datablade tags are interpreted at run time by a module running inside Apache. Pages that contain the Web Datablade tags are called "app pages".
The syntax of app pages is not particularly pretty, but it is easy to understand. What is less easy to grasp are the subtleties of getting app pages to scale to a web site of our size. We have not done a particularly good job in the past at creating readable and maintainable app pages.
All app pages are stored under the ZFIN_WWW/home/ZFIN/APP_PAGES directory. That directory contains many subdirectories, each covering a different subject area or infrastructure function in the ZFIN web site. All app pages related to a particular topic are stored in that topic's directory. Some directories also have subdirectories to handle particular parts of that area.
Subject area directories, such as gene_files or personbrowser, cover specific subject areas in the web site such as genes or people. Infrastructure directories, such as navigation_files or security_files contain files that are used across many subject areas of the web site.
Directory names should accurately reflect their contents. This makes it easy to know where to look to see if an app page already exists for a particular purpose.
ZFIN does not have a standard on capitalization or underscores in directory names.
First some general app page file name standards
The most significant fact about app page variables is that all app page variables are global in scope. No matter how you WebExplode another app page (there are several ways), the called app page and the calling app page share the same variable name space. If you modify, create, or destroy a variable in either page then the other page knows about it as well.
Thus we need to take great care we need to take great care when naming app page variables so that
The main way we address these problems is that every app page has a unique 2 to 10 character variable name prefix associated with it, and every variable that is used in that page starts with that prefix followed by an underscore, followed by a descriptive name of the variable.
For example, the prefix for xpatselect.apg is xpatsel_. All variables in this file should (but currently don't) begin with this prefix.
There is another type of global variable - project global. In Java they would be Protected variables. These variables are shared between a top-level page and subroutine page(s). Either page may change the value and explicitly renaming the variable to pass into the subroutine page is not required. Subroutines list the global variables under INPUT VARS that it uses or modifies.
All project global variables should append G_ to the top-page prefix. For example, xpatselect.apg global variables would begin with the prefix xpatsel_G_ . Thus search by prefix will return all file specific variables.
We discussed if we should treat app page variables that are visible in URLs any different. We decided that we should not. They too should use app page prefixes in their name.
Finally, we talked about what to do with "global" variable names like $OID or any variable name that exists in secure_navigation.apg. We decide to leave these variables as they are, at least for the time being, as it is just too big a task to convert them.
Other than using app page prefixes, ZFIN does not have a standard for naming variables that either came from a database table or are destined for a database table.
We discussed the possibility of naming such variables $apgprfx_database_column_name, where database_column_name is the exact name of the column in the database. This is not a bad standard, and you are free to follow it, but there are many situations where you don't want to use it. For example, you might have a page that deals with genes and their associated ESTs. The ZDB IDs for both come from the mrkr_zdb_id column in the marker table. There are two problems here:
Pick database column variable names that make the code the most readable.
When query results are initially returned to an app page from an SQL statement, they are placed in variables named $1 through $n where n is the number of columns returned by the query. You should always assign $n variables to variables with standard names immediately after the SQL statement that produced them.
Exceptions: You don't have to do this if all these conditions are met:
Nothing is less clear than using a variable named $1 twenty lines after an SQL query. $1 is also in danger of being stomped on if we insert any SQL between the SQL that originally populated $1 and when $1 gets used.
Boolean variable names should clearly communicate that they are boolean variables. For example, apgprfx_fishIsMutant and apgprfx_idsAreEqual are preferred over apgprfx_mutant and apgprfx_same. Use "is" and "are" in boolean variable names.
Also see the section on app page boolean variable usage.
Some significant facts about app pages:
All of these facts have conspired to produce some pretty heinous app pages at ZFIN. We underutilize the WebExplode() function, resulting in 20 and 30 page long app pages. When we do use WebExplode() we have had such poor naming conventions for app page variables that we have no way to tell where a variable might have come from. Better use of WebExplode, naming conventions, and well defined interfaces will all help address these problems.
When discussing app pages, it is useful to classify them by how they are called:
An app page should be either a top level app page, or a subroutine app page, but never both.
You should create a subroutine app page in any of these situations:
An alternative to placing code/subroutines in app pages is to place the code in a Stored Procedure Language (SPL) routine inside the database. Here are some guidelines on when you might choose which technology.
However, there are some restrictions on what can be done in SPLs that are called from WebExploded app pages. In particular, you can run any DDL, other than to create temporary tables, and you also may not be able to insert or delete rows from tables.
Every piece of code we have at ZFIN, including app pages, should have a defined interface. Every app page must have this comment block at the top. If you make a non-trivial change to a page that does not have a defined interface comment block at the top then you must add one and document the parts of the interface that you are changing.
If your revision does not involve exhaustively researching the page's interface then you don't have to fully document the interface. For any section that is not fully documented, insert the statement
| INPUT VARS | Lists all input variables to this page. If the existence of a variable is optional, then this also describes what the non-existence of a variable means. |
| OUTPUT VARS | Descriptions of each output variable. If existence is optional
then includes a description of what non-existence means. If this
app page changes the existence or values of any input variables
then those variables must be listed as output variables as well.
Note that because of the global scope of app page variables that any variable created in this app page is in fact visible to the caller. However, only variables in the OUTPUT VARS list should be referenced by the calling page. |
| OUTPUT | Many app pages, and all top level app pages, produce HTML as output. This is a description of that HTML. This is a description of the $1 value returned by WebExplode(). |
This section covers how to use app page variables. Naming conventions for app page variables are covered elsewhere.
The Web Datablade does not have strongly typed variables. There is no boolean type in the Web Datablade. However, like Perl, C/C++ and Java, it does have a definition of True and False. Anything that is 0 is False, and anything that is not 0 is True. This differs from Informix itself which returns 'f' for false and 't' for true.
ZFIN uses a wide variety of conventions to represent boolean values in its app pages. This includes the built in 0/not 0 distinction, a plethora of t/f, T/F, True/False, etc. variations, and the existence or non-existence of a variable.
It would be nice if we could settle on a single way to represent booleans. However, because of the mismatch between Web DataBlade and the Informix DBMS, and because sometimes we have to use existence/non-existence, I don't think it is worth the effort. However, I think we should reduce the number of boolean representations down to these 3:
0 for false, not 0 for true. This is the representation Web DataBlade uses. Example:
'f' for false, 't' for true. This is the representation boolean columns in Informix have. Example:
Not exists for false, exists for true. We are forced to support this option with checkboxes, and it is darn convenient to use as well. Example:
In app pages we can use the existence or non-existence of a variable to communicate information. ZFIN does not have a standard on using the existence or non-existence of a variable to communicate information.
Note that when using checkboxes, existence/non-existence is the only way to communicate if the box was checked or not.
Also note that subroutine app pages must take care when using app page variable existence/non-existence internally. Subroutine app pages can be called multiple times within one transaction, and any variable created or destroyed on a previous call will be in that state in subsequent calls to that app page in the current transaction.
App pages support arrays in the form of vector variables. The use of vector variables is encouraged wherever applicable. A known problem with vector variables is that they can't be explicitly passed to subroutine app pages in WebExplode (they can be implicitly passed). Also, you can't use a vector variable in a URL.
This section describes some best practices for how to actually use the Web Datablade tags and functions (i.e., the Web Datablade markup language) in app pages.
You can use a COND clause with either MIBLOCK or MIVAR tags to control the execution of blocks of code in app pages. Here are the advantages and disadvantages of each.
If your code is logically using an "if-then-else" structure then use MIBLOCK followed by MIELSEs. This makes the structure of the code clearer.
If your code is logically using an "if-then" structure then you can use either MIBLOCK or MIVAR around the code. If you use MIBLOCK and the block of code contains app page variables then you will need at least one additional MIVAR tag around those variables.
You have to use the MIVAR tag to create and set the initial value of a variable, but you can use either the MIVAR tag or a $(SETVAR) function to update the variable. ZFIN has no standard on using MIVAR or $(SETVAR). Use whichever one you want.
See the general comments section for guidelines on using comments in all language. This section covers issues specific to app pages.
The most important comments in an app page are the defined interface comments. See the App Page Defined Interfaces section for details on this.
HTML and Web Datablade both have comment tags. HTML comments are left in the page by the Web Datablade and are sent out to the world. Web Datablade comments, which are wrapped in MICOMMENT tags, are stripped out by the Web Datablade and are not sent out to the world.
Use MICOMMENTs rather than HTML comments in app pages. There is no reason to be sending out comments to the world. If we need to communicate something to our users, they should see that in the displayed page. They should not have to look at the source code.
MICOMMENTs have the disadvantage that they don't stand out visually in the code as much as HTML comments. Therefore, any MICOMMENTs after the app page defined interface comment should have 3 leading asterisks on each line of the comment. Single line comments should also have 3 trailing asterisks.
For example:
Any closing Web Datablade tag that is more than 50 lines from its corresponding starting tag should have a comment on the closing tag indicating what it is closing. For example,
Since app pages tend to be long, section separator comments can help make them almost manageable. Here is one possibility for their format.
HTML in app pages should conform to the general HTML standards for static pages.
In addition, whenever possible, app page tags and functions and HTML tags should be correctly nested. In other words, it is bad practice to open an HTML tag, then open an app page tag, then close the HTML tag, and then close the app page tag. For example, don't do this:
The Web Datablade will accept this nesting, it is just hard to read. In most cases it is trivial to convert code to use a nesting that makes sense. Do this instead:
In some very complex cases it is not possible to do the correct nesting. If you find yourself in such a case, then the design of the code may be too complex.
This section discusses issues that are specific to using JavaScript in app pages. See the JavaScript section for general guidelines on writing JavaScript.
Kevin will write this section.
The ZFIN web site makes use of temporary tables in some of its web pages. However, starting with Informix 9.4, extreme care must be taken when using temporary tables in app pages that are web exploded by other app pages. In particular, you can still create and use temporary tables in such app pages, but you can no longer drop them. (See the discussion of MIqry2pass variable in the Web Datablade Application Developer's Guide for more information.)
We first tried using permanent tables and Informix session IDs to keep the different invocations of an app page separate. However, this approach got tripped up on -244 errors when the table was under high use.
Our second approach was more successful. We went back to using temporary tables, but we now include the session ID in the temp table name. Before creating a temp table, the code now checks if it already exists by calling the table_exists() database function. The app page also deletes all records from the table before returning.
App page specific tables are named
app_page_name_temp_description_sessionid
Like other ZFIN tables, each table has a unique prefix.
We do not have foreign key or other constraints on these tables. We don't think it is worth the overhead for records that will exist for less than a second.
Some example code that creates a temp table for the markerselect.apg page. First, the code that uses the session ID to set the table name.
Then the code that determines if the table already exists, and creates it if it does not:
And finally code to remove all records from the temp table before leaving the page.
An alternative would be to just leave the data in the temp table and then delete it the next time the page is invoked and the code finds that the table already exists.
ZFIN uses a standardized set of debug-related app page variables to turn debugging print statements on or off.
| $debug_sql | Complex dynamically constructed SQL statements will be displayed. |
| $debug_timing | How long it takes each part of a page to run will be displayed. This code makes extensive use of the get_time() function. |
| $debug_flow | Used to display flow of control information, such as "Decided fish is a wildtype". Often this can be inferred from the displays produced by the other debug variables, but sometimes it is clearer to have explicit flow of control debug information. |
These are enabled by defining the variables at the top of the page, or for some pages, they can also be passed in with the URL.
Be careful to strike a balance between keeping the code readable and fast (fewer debug variable tests and print statements) and making standard debugging useful (more debug variable tests and print statements).
We discussed more sophisticated methods for enabling debugging across all pages. (These included defining flags inside the database, or using the web datablade to define the variables across all pages.) We decided to try the simple approach first. If it irritates us enough, we will adopt more sophisticated techniques.
We also discussed using debug levels instead of or in addition to debug categories. We decided that categories gave us better control than levels for the same amount of work, and that adding levels to categories was more work than it was worth.
Most of ZFIN's dynamic pages use the Web Datablade app pages, but some of our dynamic pages use CGI. Someday, this section will describe standards that apply to all CGIs, no matter what language they are written in.
Each language that CGI pages are written in also has additional standards that apply to writing CGI. See
Use /private/bin/perl in all ZFIN PERL scripts. We do this for a couple of reasons:
When using perl, especially as a CGI,
always use the taint -T flag
and properly untaint all input before using it.