ZFIN Documentation:
Implementation and Development
This documentation is preserved for historical purposes, and NOT updated. Updated documentation is found here: http://almost.zfin.org/doc

$Id: impl.html,v 1.65 2006-03-30 23:17:38 peirans Exp $


This document is one of several that describe the Zebrafish Information Network, or ZFIN. It focuses on how ZFIN is implemented and on how development and maintenance are done at ZFIN.

Big Picture

Architecture

The ZFIN web site is implemented using Apache as the web server, Informix as the database management system, and the Informix Web DataBlade product to generate dynamic pages. ZFIN also uses some CGI scripts that access the database.


[Figure: ZFIN Implementation Big Picture]

Development and Deployment

ZFIN uses CVS and gmake for development and deployment. We also use some custom scripts and configuration files to map files to different test and production web sites and databases. The remainder of this section gives a high level, somewhat graphical, overview of how we use CVS and gmake to develop and deploy changes. This material is covered in more detail in the ZFIN Development Environments section.

The explanations and diagrams here assume that you are creating the zezem.zfin.org development web site, and that the zezem web site has zezdb as its backing database.

Creating a Test Web Site

First, let's talk about creating your test web site. This section assumes that you have already created the backing database. This topic is covered in much more detail in the Creating a ZFIN Development Environment, An Example section. For brevity, some of the necessary steps that are covered in that section are omitted from this section.

Creating a test web site

Shell commands:
    cd /research/zusers/username
    mkdir zezem
Purpose: Create a place to hold source files. (Directory can actually have any name.)
What happens: [diagram: mkdir]

Shell commands:
    cd zezem
    cvs checkout ZFIN_WWW
Purpose: Get current copies of all source files and makefiles.
What happens: [diagram: cvs checkout]

Shell commands:
    cd ZFIN_WWW
    gmake
Purpose: Create the web site, and update the database.
What happens: [diagram: gmake]

After doing the above, plus some other steps, you will now have a working web site.

Updating a File

The next step is to modify something, deploy it to your test web site, test it, and then commit it to CVS. This topic is covered in much more detail in the Updating a File, An Example section. For brevity, some of the necessary steps that are covered in that section are omitted from this section.

Updating a file

Shell commands:
    cd /research/zusers/username
    cd zezem/ZFIN_WWW/somedir
    cvs edit somefile
Purpose: Tell CVS you are going to edit the file.
What happens: [diagram: cvs edit]

Shell commands:
    emacs/vi/whatever somefile
    gmake
Purpose: Modify the file and then push modified version to zezem web site / DB.
What happens: [diagram: gmake]

Shell commands:
    cvs commit somefile
Purpose: Put modifications into CVS where they can be picked up by others.
What happens: [diagram: cvs commit]

Deploying A Change to Production

The changes are now in CVS, where they can be pushed to the preproduction site, and then to the production site. This topic is covered in much more detail in the Putting Changes Into Production section. This section only shows the last part, putting the changes into production.

Putting change into production

Shell commands:
    cd /research/zprod/users
    cd zfin.org/ZFIN_WWW
    cvs -q update -dP
Purpose: Pick up modified file from CVS.
What happens: [diagram: cvs update]

Shell commands:
    gmake
Purpose: Apply changes to production web site / database.
What happens: [diagram: gmake]

Machines and Informix Engines

ZFIN has several servers, each with its own Informix Engine running on it. On all the ZFIN servers, there is a 1:1 correspondence between machines and Informix engines. Multiple databases can be run under one Informix engine. On the production server we only have one database, but on the development server we have many databases.

Each Informix engine has a corresponding C shell script that can be sourced to set up the needed environment variables for that engine. These scripts are in

/private/ZfinLinks/Commons/env/InformixEngineName

Note that there is some confusion about the word server. The machines that host Informix engines are clearly servers. The Informix documentation also refers to the Informix engines as servers. However, within the Informix community (and when dealing with IBM Technical Support) they are known as engines. Engine is the term used in this documentation. Server or machine will be used in this documentation to indicate actual machines.

Each Informix engine has a unique name that identifies it. They are named after mutants that start with "w".

Production Server

Helix usually hosts the production web site and the internal mirror site.

Wildtype is the name of the Informix engine running on helix. There is a single database running in wildtype.

The C shell script that can be sourced to define the environment variables needed to use wildtype is at

/private/ZfinLinks/Commons/env/wildtype

Helix is a Sun Fire V440 with 4 Sun UltraSPARC IIIi 1.3 GHz 64-bit CPUs, and 16 GB of memory. It is a hardware twin of the development server. The disk layout for helix is described in the Production Server Disk section. Helix became the production server for ZFIN in 2004/11, replacing chromix in that job.

Development Server

Embryonix is the main development machine for ZFIN. Every test ZFIN domain name maps to embryonix. This includes the preproduction/sandbox site. Embryonix also hosts the production web site when the production server is down.

Wanda is the Informix engine running on embryonix. It supports all of the development databases, and the beta test site database as well.

The C shell script that can be sourced to define the environment variables needed to use wanda is at

/private/ZfinLinks/Commons/env/wanda

Embryonix is a Sun Fire V440 with 4 Sun UltraSPARC IIIi 1.3 GHz 64-bit CPUs, and 16 GB of memory. It is a hardware twin of the production server. The disk layout for embryonix is described in the Development Server Disk section. Embryonix became the development server for ZFIN in 2004/03, replacing bionix in that job.

Upgrade Server

From 2001/02 to 2004/03, Bionix was the main development machine for ZFIN. Prior to that chromix was both the production and development machine.

Bionix is now used mainly to test software upgrades before applying them to the development and production servers. It is also used as a terminal server for ZFIN users to do tasks that don't involve an Informix database.

Wavy is the Informix engine on bionix. However, wavy is only up and running on those rare occasions when we are testing software upgrades. All development is done on the development server.

The C shell script that can be sourced to define the environment variables needed to use wavy is at

/private/ZfinLinks/Commons/env/wavy

Bionix is a Sun Enterprise 220R with 2 Sun UltraSPARC-II 450 MHz 64-bit CPUs, and 2 GB of memory. The disk layout for bionix is described in the Upgrade Server Disk section. Bionix was acquired by ZFIN in 2000/10.
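
The machine, engine, and environment script pairings described above can be summarized in a small lookup. The following is only a sketch: the function names engine_for_machine and engine_env_script are hypothetical and are not part of the ZFIN Commons. It is written in Bourne shell rather than C shell because C shell has no functions; the script it points at is still meant to be sourced from a C shell.

```shell
# Hypothetical helpers, not part of the ZFIN Commons.

# Print the Informix engine that runs on a given ZFIN machine.
engine_for_machine() {
    case "$1" in
        helix)     echo wildtype ;;
        embryonix) echo wanda ;;
        bionix)    echo wavy ;;
        *)         echo "no Informix engine known for: $1" >&2
                   return 1 ;;
    esac
}

# Print the path of the C shell script that sets up that engine's environment.
engine_env_script() {
    engine=$(engine_for_machine "$1") || return 1
    echo "/private/ZfinLinks/Commons/env/$engine"
}
```

For example, engine_env_script helix prints /private/ZfinLinks/Commons/env/wildtype, and a machine without an Informix engine, such as genomix, makes both functions fail with a non-zero status.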

BLAST Server

As of 2005/02, ZFIN's BLAST server is embryonix, the ZFIN development server. We are in the process of moving it from the development server, to its own server, genomix, an Apple G5 Xserve cluster, running BioTeam's iNquiry software.

Genomix is unique among ZFIN servers. It is the only one that is not a Sun Solaris box. We also have distinct ways of maintaining and administering this server.

Genomix is an Apple G5 Xserve cluster with one head node, five compute nodes, and its own network switch to tie the head node and the compute nodes. The head node has two 2.0 GHz PowerPC G5 CPUs, 5 GB of DDR400 ECC memory, and 3 internal 250 GB Serial ATA drives, configured as RAID 5. The compute nodes all have two 2.0 GHz PowerPC G5 CPUs, 5 GB of DDR400 ECC memory, and one internal 80 GB Serial ATA drive.

FogBUGZ Server

FogBUGZ is a project management / defect tracking tool used by ZFIN to keep track of bugs, new feature development, suggestions, and technical discussions. FogBUGZ runs on ZFIN's only Windows server, zfinwinserver1.uoregon.edu, which resides in Room 333 of the ZFIN office space. See the FogBUGZ section for more information on how FogBUGZ is used at ZFIN.

Non-Server Machines

ZFIN owns a number of machines that are used purely as clients. They are:

ZFIN Client Machines

poetix
    A Sun workstation. This resides on some CS grad student's desk. It can be used as a terminal server.
Lots of Windows PCs
    If you are having trouble with the Windows machine on your desk, Tom, Kevin, and Brock are good places to start. After that you can try Mike McHorse or Don Pate in Neuroscience.
Lots of Macs
    If you are having trouble with the Mac on your desk, Erik and Dave F are good places to start. After that Mike and Don in Neuroscience may be able to help.

In addition, ZFIN owns an uninterruptible power supply (UPS) and a darn large rack in which all this equipment is mounted.

Past Servers

ZFIN has been around long enough to have a history. You will occasionally see references to servers that no longer exist. Here are the details on those machines.

Past ZFIN Servers

Zfishstix
    Zfishstix was the original ZFIN production server. It was powered down when chromix was brought online on 2000/03/15. It was not used after that. It was officially decommissioned and recycled in 2002/09.
Chromix
    Chromix was the production server after zfishstix. It was acquired by ZFIN in 1999 and it was the production server for almost 5 years. It was decommissioned in 2004/11. For most of that time, and for most purposes, it had more power than we needed. This changed in August 2004, when some critical mass of data, features, and users was reached. Overnight chromix was no longer up to the task. FogBUGZ case 452 chronicles our efforts to make chromix run faster during its last days. FogBUGZ case 454 describes how we picked chromix's replacement and then moved to it. Chromix was a Sun Enterprise 450 with 4 Sun UltraSPARC-II 300 MHz 64-bit CPUs, and 4 GB of memory. We upgraded from 1 GB of memory to 4 GB of memory in 2003.
Genetix
    Genetix was the original ZFIN development server. It stopped serving that purpose in 2000 when chromix took over that role. From then until genetix was powered down in 2002/06, it was used as a workstation. Genetix had a single Sun UltraSPARC 142 MHz 64-bit CPU, and 128 MB of memory. It was officially decommissioned and recycled in 2002/09.

Directories and Files

Web Sites

The files in the production web site are usually located at

/research/zprod/www_homes/zfin.org

The files to support the internal ZFIN mirror site are usually located at

/research/zprod/www_homes/mirror

The files in the other ZFIN web sites are located at

/research/zcentral/www_homes/VirtHost

Web sites also have source trees from which the web sites are actually built. See ZFIN Development Environments for details.

Web Site Directory Structure

The ZFIN web site directory tree contains everything (almost) that goes into making a ZFIN web site. The actual directory structure of a web site very closely parallels the source directory from which it was built. The structure described here is for both the web sites themselves and the source trees from which they are built.
Important directories in the web site directory structure
cgi-bin/, or
cgi-bin_VirtHost
The CGI bin directory for the web site. Contains all server side scripts that are directly executable through HTTP.
client_apps/ Contains all ZFIN applets. In other words this contains all executables that are run on the client side.
home/ This is the document root of the web site. All of the static web pages and the app pages are stored under this directory, as are all the images.
home/
  images/
Contains most of the generic graphics files used in the web sites. This directory does not contain pictures of specific mutants, fish features, or zebrafish at specific developmental stages. Rather, it contains icons and generic pictures of fish.
home/
  images/
    LOCAL/
This contains icons and graphics that are specific to the ZFIN site such as pictures of fish, or ZFIN logos. The other images directories contain generic icons that could be used by any site.
home/
  ZFIN/
Contains files that are used to access the database.
home/
  ZFIN/
    APP_PAGES/
This is the app page hierarchy containing all of the web pages that are used to access the ZFIN database. This contains mostly app pages, but also contains some HTML files as well. Note that the app pages that are actually run are defined in the database. The app pages defined in these directories are where the app pages in the database should be loaded from, but there is no guarantee that the two agree.
home/
  ZFIN/
    APP_PAGES/
      */
These directories contain the app page definitions.
home/
  ZFIN/
    misc_html/
Contains various HTML files that are used to support the app pages.
home/
  zf_info/
The static web page hierarchy. This part of the web site is mostly managed by Monte and Sherry. See the Static Web Page Updates section for details on how this is done.
home/
  zf_info/
    SEARCH_SITE/
The home page has a "Search this site" button that searches the static web pages on the ZFIN site for text strings. The web pages to support that are in this directory.
home/
  zf_info/
    anatomy/
Pages about zebrafish anatomy including the anatomical dictionary.
home/
  zf_info/
    dbase/
Public documents and published papers about the web site and the underlying database.
home/
  zf_info/
    images/
Images of zebrafish at specific stages of development. The images in this directory can be viewed as data, whereas the images in home/images can be viewed as graphic design and decoration.
home/
  zf_info/
    monitor/
Current and back issues of the Zebrafish Science Monitor.
home/
  zf_info/
    news/
Contains current news about the ZFIN site, and zebrafish related jobs and meetings.
home/
  zf_info/
    stckctr/
Pages describing the Zebrafish International Resource Center, also known as the stock center. These files are slowly being replaced by files on the ZIRC server.
home/
  zf_info/
    zfbook/
An online version of the University of Oregon Zebrafish Book, a document describing the care and feeding of zebrafish and how to do zebrafish research.
lib/ Contains libraries, object or class files that run on the server.
lib/
  DB_functions/
Defines functions that are loaded into the database and are callable from SQL. The functions are divided into subdirectories based on source language (zextend is the exception) rather than on a functional basis because of makefile issues. The source language of the function (either C or SPL) has a great deal of impact on the makefile. It was much easier to group them by language than to deal with multiple languages in one makefile. See the Database Functions section for a description of each of the functions.
lib/
  DB_functions/
    C/
ZFIN has several database functions written in C. Functions that have been around since before 2001 are defined in the zextend directory. Functions that are more recent or that have been significantly modified are in this directory. This directory offers a cleaner implementation than zextend. In zextend, all of the functions are defined in a single file, and dropping and creating functions involves dropping and creating all of the functions in the one file. In the C directory each function has its own file.
lib/
  DB_functions/
    SPL/
Many of ZFIN's database functions are written in SPL, the Informix Stored Procedure Language. ZFIN's SPL routines are defined in this directory, including the regen functions.
lib/
  DB_functions/
    zextend/
This directory contains database functions that were written in C, before 2001. C functions added since then are defined in the C directory, where each function gets its own file. In zextend, all of the functions are defined in a single file, and dropping and creating functions involves dropping and creating all of the functions in the one file. See the directory's Makefile for details.
lib/
  DB_triggers/
Defines all the triggers in the database.
server_apps/ Applications that run on the server side, where we need a separate instance of the application for each database/web site.
server_apps/
  cron/
Defines the ZFIN crontab file and a means for starting it and stopping it.
server_apps/
  DB_maintenance/
Database maintenance scripts, including those related to backups.
server_apps/
  sysexecs/
The name of this directory comes from the sysexec() database function (defined under lib/DB_functions). sysexec() is a C function that is used to call executables/scripts on the server from within SQL in app pages. This directory contains/defines executables that are called using the sysexec() function.
server_apps/
  WebSiteTools/
Contains web site management tools. It contains the checklinks link validity checker, and the signs of life script that runs every few minutes.

ZFIN Central

ZFIN Central is the home of all ZFIN source code, documentation, test data, development executables and scripts, and all of the test web sites. It contains almost everything you need to develop and test a ZFIN web site. ZFIN Central does not contain the production web site or the actual test or production data or Informix executables.

ZFIN Central is at:

/research/zcentral

See the Disk Usage section for where ZFIN Central is physically located.
ZFIN Central Subdirectories
/research/zcentral/loadUp ZFIN development loadUp directories are stored here.
/research/zcentral/Commons Development documentation, environment, and /bin scripts (see details below).
/research/zcentral/data
/research/zcentral/ftp ZFIN development ftp directories
/research/zcentral/www_homes ZFIN development website target directories

ZFIN Central LoadUp directory structure

ZFIN Central LoadUp is at:

/research/zcentral/loadUp

Every night, the files in the development loadUp directories are checked against the production loadUp repository. Files that exist in the /research/zcentral/loadUp directories but not in the /research/zprod/loadUp directories are deleted from /research/zcentral/loadUp (a backup copy, tagged with 'loadupbkup', is stored in /tmp). Files that exist in /research/zprod/loadUp but are missing from /research/zcentral/loadUp are copied to development. Only files that have changed between production and development are updated (rsync checks the time/date stamp and filename; see the rsync man pages for details).
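
The nightly comparison described above might look roughly like the following rsync invocation. This is a sketch under stated assumptions, not the actual ZFIN job: the function name sync_loadup, the use of /tmp directly as the backup directory, and the .loadupbkup suffix are all guesses based on the description, and it assumes both trees are visible from the machine that runs the job. The RSYNC variable is a hypothetical hook that lets you preview the command instead of running it.

```shell
# Hypothetical sketch of the nightly loadUp sync; not the real ZFIN script.
sync_loadup() {
    # Set RSYNC=echo to print the arguments instead of running rsync.
    # --archive    copy recursively, preserving times and permissions
    # --delete     remove development files that no longer exist in production
    # --backup     keep a copy of anything that is replaced or deleted
    # --backup-dir and --suffix say where the copy goes and how it is tagged
    ${RSYNC:-rsync} --archive --delete \
        --backup --backup-dir=/tmp --suffix=.loadupbkup \
        /research/zprod/loadUp/ /research/zcentral/loadUp/
}
```

Running it with RSYNC=echo set prints the argument list without touching any files. rsync also has a --dry-run option that reports what would change without changing it.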

Each of the subdirectories listed below contains a /bkup directory. /loadUp and all directories under it are owned by zfishweb:www. Zfishweb must own these directories; otherwise the CGI that does the loading will fail. See the Apache configuration for more details.
ZFIN Central LoadUp Subdirectories
/research/zcentral/loadUp/imageLoadUp ZFIN development images stored here.
/research/zcentral/loadUp/PDFLoadUp ZFIN development PDF files stored here.
/research/zcentral/loadUp/embryonixLoadUp zfin.org on embryonix pdf/image files stored here

ZFIN Commons

The ZFIN Commons directory hierarchy contains files (e.g., scripts and documentation) that are used by the ZFIN team to build and manage ZFIN web sites and databases. (Note that these files are not used at runtime. The web site can be up and running if this directory is unavailable.)
ZFIN Commons Subdirectories
bin/ Binaries and scripts that are used to set up test web sites or the production web site, or that are otherwise useful for web site development.
doc/ ZFIN documentation, including this document.
env/ Scripts and data files that are useful for setting up development environments.

The bin directory is placed in your $PATH when you source the .env file for your ZFIN development environment. (Do not add the bin directory to your $PATH in your .cshrc file. If the filesystem containing the ZFIN Commons is down, then you won't be able to log in.)

The Commons directory and its subdirectories are managed as the Commons CVS project. There is a copy of the Commons directory on both the production and development servers. On both servers, the

/private/ZfinLinks/Commons

sym link points at the local copy of the Commons directory. Having a Commons directory on both servers ensures that the production web site can still be built, even if the development server is down.

See Updating Files in the ZFIN Commons for details on how to update files in the Commons directory.

See the Disk Usage section for where the 2 copies of ZFIN Commons are physically located.

ZFIN Prod

The directory

/research/zprod

is the production web site filesystem when the production web site is running on the production server.

See the Disk Usage section for where ZFIN Prod is physically located.

ZFIN Prod LoadUp directory structure

ZFIN Prod LoadUp is at:

/research/zprod/loadUp

Each of the subdirectories listed below contains a /bkup directory. /loadUp and all directories under it are owned by zfishweb:www. Zfishweb must own these directories; otherwise the CGI that does the loading will fail. See the Apache configuration for more details.
ZFIN Prod LoadUp Subdirectories
/research/zprod/loadUp/imageLoadUp ZFIN production images stored here.
/research/zprod/loadUp/PDFLoadUp ZFIN production PDF files stored here.

Production copies of images and PDFs are stored in these directories.

ZFIN LoadUp File Naming Conventions

Files in /research/zprod/loadUp/imageLoadUp and /research/zprod/loadUp/PDFLoadUp are named by their home-table ZDB-id. In addition, for images, the type of file is indicated by a suffix on the name. For example, images from the fish_image table are named like:

ZDB-IMAGE-010101-1.jpg (for images)
ZDB-IMAGE-010101-1_thumb.jpg (for thumbnails)
ZDB-IMAGE-010101-1_annot.jpg (for images with annotation)
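
The naming pattern shown in these examples can be expressed as a small helper. This is an illustrative sketch only: the function name loadup_image_name and the variant keywords image, thumb, and annot are made up for the example, and the default jpg extension is taken from the file names above rather than from any stated rule.

```shell
# Hypothetical helper: build a loadUp image file name from a ZDB id.
#   $1 = ZDB id, for example ZDB-IMAGE-010101-1
#   $2 = variant: image (default), thumb, or annot
#   $3 = file extension (default jpg)
loadup_image_name() {
    id=$1
    variant=${2:-image}
    ext=${3:-jpg}
    case "$variant" in
        image)       echo "$id.$ext" ;;
        thumb|annot) echo "${id}_$variant.$ext" ;;
        *)           echo "unknown variant: $variant" >&2
                     return 1 ;;
    esac
}
```

For example, loadup_image_name ZDB-IMAGE-010101-1 thumb prints ZDB-IMAGE-010101-1_thumb.jpg.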

ZFIN Users

The directory

/research/zusers

has a subdirectory for each ZFIN developer. In general, developers do their ZFIN related work in these subdirectories. The source trees for the web sites that a developer is working on are generally found under their users directory. There are several subdirectories under users that do not correspond to particular ZFIN developers.
ZFIN Users Special Subdirectories
almost Holds the source tree for the preproduction/sandbox site.
bionix Holds the source tree for the bionix.cs.uoregon.edu web site. This site is only used when testing software upgrades on bionix.
embryonix Holds the source tree for the zfin.org web site, when it resides on the development server, which is rarely.
helix This is a symbolic link to /research/zprod/users/zfin.org, which is the source tree for the zfin.org web site when it resides on the production server, which is most of the time.
mirror This is a symbolic link to /research/zprod/users/mirror, which is the source tree for the internal ZFIN mirror site, mirror.zfin.org. However, when the zfin.org web site resides on the development server, so does mirror.zfin.org. When that happens, this is not a sym link, but is rather the source directory for mirror.zfin.org. See the Mirror Sites section for more details.

See the Disk Usage section for where ZFIN Users is physically located.

ZFIN Unloads

The directory

/research/zunloads

stores recent unloads of the production database that were created with the unloaddb.pl script. As of 2004/11 we can fit about 2 1/2 months of dumps in this filesystem before running out of space. We move the last dump of each month from this directory to the ZFIN Archive filesystem for long term storage.
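
The rule of keeping the last dump of each month can be sketched as a small filter. This is only an illustration: it assumes each dump is identified by a YYYY-MM-DD date, one per input line, which is a hypothetical naming scheme and not necessarily what unloaddb.pl produces. The name last_dump_per_month is likewise made up.

```shell
# Hypothetical filter: read dump dates (YYYY-MM-DD, one per line) on stdin
# and print only the latest date seen in each month, in ascending order.
last_dump_per_month() {
    sort | awk '
        # Input is sorted, so the last value stored per month is the latest.
        { month = substr($0, 1, 7); last[month] = $0 }
        END { for (m in last) print last[m] }
    ' | sort
}
```

The dates it prints would be the dumps to move to the ZFIN Archive filesystem; everything else stays in /research/zunloads until space runs out.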

See the Disk Usage section for where ZFIN Unloads is physically located.

ZFIN Archive

The ZFIN Archive contains archival files that are no longer actively used in ZFIN, but that we can't quite bring ourselves to delete. It also contains long term backup data, and past Apache logs for the production web site. The ZFIN Archive is spread across several filesystems. However it is all available, either directly or through sym links, from

/research/zarchive0

Here is a list of some important directories in the ZFIN Archive. Most of these directories have a README file that explains the contents of the directories.
ZFIN Archive Subdirectories
History/
  1996-2000-IllustraRunning
Archival files from the early years of ZFIN when it was running on zfishstix and using the Illustra DBMS. These files reflect the state of the site before it was ported to Informix and moved to chromix.
History/
  2000-InformixPort/
Archival files from the port of the ZFIN web site from Illustra to Informix and its simultaneous move from zfishstix to chromix. This port happened on 2000/03/15, although it took a year of preparation prior to that.
History/
  2000-2001-InformixRunning/
Archival files from the first 10 months of ZFIN running under Informix on Chromix. This also contains files that were used on genetix.
History/
  2001-InfrastructurePort/
In late 2000 and early 2001 all of the files that make up the web site were moved under a single CVS project, and makefiles were added to create the entire web site. This process started on 2000/12/19 when the files were put under CVS and the new test server, bionix, was brought up. It was finished on 2001/03/30 when the production server was upgraded from Solaris 7 and Informix 9.20 to Solaris 8 and Informix 9.21.
History/
  2001-InfrastructureRunning/
Archival files from when the website was running using the new CVS/Makefile infrastructure.
databases/ Contains historical dumps of databases, mostly from production. The dumps were done with the unloaddb.pl script.
Logs/ Contains historical logs, mostly Apache logs from production. Tom.
users/ This directory has subdirectories for each ZFIN developer. These subdirectories contain files that the developers want to archive.

See the Disk Usage section for where ZFIN Archive is physically located.

/private

Each ZFIN server machine has its own locally mounted /private directory that contains "system" executables that are needed on ZFIN servers, but that are not part of the standard system installation. This includes things like Apache and Informix. See the System, Web Site, and Database Administration document for more details on /private.

/private/ZfinLinks

The /private/ZfinLinks directory exists on both the production and development servers. It contains symbolic links that point to important ZFIN directories that are either different on the two servers, or that switch when we move zfin.org from one machine to another. The sym links in this directory allow us to write generic scripts that will run the same no matter what server they are run on, and no matter where zfin.org resides. There is a README file in this directory that explains the purpose of each sym link, and what the value of each symlink is when zfin.org is on each server.

Databases

Each Informix engine stores its databases in a collection of raw disk partitions that it owns on the machine it is running on. You do not need to know where these partitions are located in order to use Informix. See the Disk Usage section for details on where the partitions actually are.

Web Site / Database / Machine Matrix

This section describes which machines, Informix engines, and databases support which web sites.

Machine: helix
Informix Engine: wildtype
C Shell Environment File: /research/zcentral/Commons/env/wildtype
Web Site Directory: /research/zprod/www_homes/{zfin.org,mirror}

Domain Name Database Purpose
zfin.org zfindb Production web site and database.
mirror.zfin.org none Used to test and support our mirror web sites. There is no database associated with this web site. The mirrors redirect all database access to the production web site. Our external mirror sites are copies of this web site. See the Mirror Sites sections for details.


Machine: embryonix
Informix Engine: wanda
C Shell Environment File: /research/zcentral/Commons/env/wanda
Web Site Directory: /research/zcentral/www_homes/VirtHostName

Domain Name Database Production Web Sites
zfin.org zfindb This is the production site on those rare occasions when the production site is running on the development server. This is reachable as embryonix.cs.uoregon.edu shortly before and after moving zfin.org to or from embryonix. The filesystem and database backing this web site exist during these rare times.
mirror.zfin.org none This is the internal ZFIN mirror site on those rare occasions when the production site is running on the development server.
Domain Name Database Preproduction Web Site
almost.zfin.org almdb This is the preproduction and sandbox site used as a final test platform for changes before they go into production. This also allows developers and curators to interact with a ZFIN database and/or web site without impacting the production server.
Domain Name Database Development Web Sites
albino.zfin.org clemdb Development web site, used by Dave Clements.
beaky.zfin.org tomdb Development web site, used by Tom Conlin.
coral.zfin.org kevdb Development web site, used by Kevin Schaper.
dino.zfin.org judydb Development web site, used by Judy Sprague.
edison.zfin.org yoldb Development web site, used by Prita Mani.
frost.zfin.org none Unused.
gorp.zfin.org brockdb Development web site, used by Brock Sprunger.
hoover.zfin.org hoovdb Development web site, used by Sierra Taylor.
iguana.zfin.org iguadb Development web site, used by Kevin Schaper.
junior.zfin.org jrdb Development web site, used by Peiran Song.
kirby.zfin.org kirbdb Development web site, used by Judy Sprague.
lucky.zfin.org luckdb Development web site, used by Brock Sprunger.
manx.zfin.org mnxdb Development web site, used by Tim Mason for updates related to ZIRC.
nagel.zfin.org nagdb Development web site, used by Ron Holland for updates related to ZIRC.
ogon.zfin.org ogodb Development web site, used by Peiran Song.
polka.zfin.org plkdb Development web site, used by Prita Mani.
quark.zfin.org quadb Development web site, used by Paea LePendu.
runzel.zfin.org ruzdb Development web site, used by Sherry Giglia.
swirl.zfin.org swrdb Development web site, used by Sierra Taylor.
tango.zfin.org tandb Development web site, used by Paea LePendu.
ukkie.zfin.org ukidb Development web site, used by Sergei Bogdanov for updates related to ZIRC.
viper.zfin.org vipdb Development web site, used by Xiang Shao.
whirly.zfin.org whrdb Development web site, used by Xiang Shao.
xray.zfin.org
yoyo.zfin.org
zezem.zfin.org
none As of 2005/10 these are unassigned.
test.zfin.org testzfinorgdb Used to do external beta testing of new ZFIN features.


Machine: bionix
Informix Engine: wavy
C Shell Environment File: /research/zcentral/Commons/env/wavy
Web Site Directory: /research/zbionix1/www_homes/bionix

Domain Name Database Purpose
bionix.cs.uoregon.edu biondb Test web site used only to test software upgrades. Most of the time this web site is not up and running.

ZFIN Development Environments

A ZFIN development environment is a complete copy of the ZFIN web site and ZFIN database. Each environment has a source directory tree where development is done and a target directory tree, which is the web site itself and which is produced by makefiles in the source tree. Each development environment has its own URL. Each developer has one or more development environments, depending on the projects they are working on at the time. Each development environment is tied to one CVS branch.

Environment Variables

A plethora of environment variables must be set to do ZFIN development. These variables can be broken into two broad categories: Informix variables and ZFIN Makefile variables. Informix variables are those that need to be defined in order to access an Informix database. Makefile variables are additional variables that must be set to use the ZFIN Makefiles and/or to check files out of CVS.

.env Files

For each development environment, there exists a .env file which can be sourced to set all of the environment variables for that environment. The .env files are in:

/research/zcentral/Commons/env/VirtHost.env

These set the CVSROOT, Informix, and Makefile environment variables that are needed for the particular environment. The variables themselves are discussed in the following sections.

To set all the environment variables in your shell, source the script for the virtual host/development environment you want to use. For example:

% source /private/ZfinLinks/Commons/env/albino.env

sets your environment variables for the albino development environment on embryonix. If you get tired of typing all that all the time, you can set up an alias in your ~/.cshrc file:

alias srcalbino source /private/ZfinLinks/Commons/env/albino.env

and then at your shell prompt you could just type

% srcalbino

to set up all your environment variables.

Informix Environment Variables

Before you can access a database in any Informix engine, you must first set a number of Informix environment variables. These variables must be set:
Key Informix Environment Variables
Variable Description
INFORMIXDIR Absolute path of directory where Informix engine is installed and is running.
INFORMIXSERVER Name of the Informix server/engine.
ONCONFIG Name of the onconfig file for the Informix engine. This contains the configuration of the server.
INFORMIXSQLHOSTS Absolute path of the SQLHOSTS file for the Informix engine. Used by Informix to route connections to the right engine.

In addition, you must also modify the settings of your PATH and LD_LIBRARY_PATH environment variables to include Informix binaries and libraries. These are the only environment variables that are required, but there are several others that can affect how the server runs. See the Informix Settings section for details.

The .env files described in the previous section set these variables for you. If you want to set only the Informix environment variables, there are C Shell scripts that do just that. They are in

/private/ZfinLinks/Commons/env/InformixServerName

To set the Informix environment variables in your shell, source the script for the Informix engine you want to use. For example:

% source /private/ZfinLinks/Commons/env/wanda

sets your environment variables for the wanda Informix engine on Embryonix. If you get tired of typing all that all the time, you can set up an alias in your ~/.cshrc file:

alias srcwanda source /private/ZfinLinks/Commons/env/wanda

and then at your shell prompt you could just type

% srcwanda

to set up your Informix environment variables.

You must be logged on to the machine the Informix engine is on in order for the environment variables to be effective. The shell scripts also change your prompt to let you know which Informix engine you are currently set up to use.

Makefile Environment Variables

These environment variables must be set in addition to the Informix environment variables, in order to use the ZFIN makefiles.
Makefile Environment Variables
Variable Description
CVSROOT Tells CVS where source files are. This should always be set to /research/zcentral/Vault/CVSroot.
DBNAME Name of the database in $INFORMIXSERVER to work with. Each developer works with their own database(s). This must agree with the <!--|DB_NAME|--> value in the translate table file. See the Web Site / Database / Machine Matrix for which database goes with which web site.
TARGETROOT The directory where the makefiles put their output. This identifies the root of the target web site. This is the specific version of the tree. The makefiles reside in the generic version. This must be an absolute path and is always of the form /research/zcentral/www_homes/VirtualHostName.
TARGETCGIBIN Name of the cgi-bin directory to use. This name is relative to $TARGETROOT and must agree with the <!--|CGI_BIN_CIR_NAME|--> value in the translate table file. It typically has the form cgi-bin_VirtualHostName.
TARGETFTPROOT Full path to the FTP directory for this site. This must be an absolute path and must agree with the <!--|FTP_ROOT|--> value in the translate table file.
TRANSLATETABLE The file containing the translate table to use when converting generic files into their specific counterpart. This must be an absolute path. A set of predefined translate tables are defined in the ZFIN Central Commons directory with the names: /private/ZfinLinks/Commons/env/VirtualHostName.tt
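The requirements in the two tables above can be checked mechanically before running gmake. The following is a minimal Python sketch, not part of the ZFIN tooling; the function name check_environment is hypothetical, and the only rules it applies are the ones stated in the tables (the variables must be set, and the ones described as absolute paths must start with a slash).

```python
import os

# Variable names taken from the tables above.
INFORMIX_VARS = ["INFORMIXDIR", "INFORMIXSERVER", "ONCONFIG", "INFORMIXSQLHOSTS"]
MAKEFILE_VARS = ["CVSROOT", "DBNAME", "TARGETROOT", "TARGETCGIBIN",
                 "TARGETFTPROOT", "TRANSLATETABLE"]
# Variables the tables describe as absolute paths.
ABSOLUTE_PATH_VARS = ["INFORMIXDIR", "INFORMIXSQLHOSTS", "TARGETROOT",
                      "TARGETFTPROOT", "TRANSLATETABLE"]


def check_environment(env):
    """Return a list of problems found in env (a dict-like of variables).

    check_environment is a hypothetical helper for illustration only.
    """
    problems = []
    for name in INFORMIX_VARS + MAKEFILE_VARS:
        if not env.get(name):
            problems.append(name + " is not set")
    for name in ABSOLUTE_PATH_VARS:
        value = env.get(name)
        if value and not value.startswith("/"):
            problems.append(name + " must be an absolute path")
    return problems


# Report problems in the current shell environment, if any.
for problem in check_environment(os.environ):
    print(problem)
```

An empty result means only that the variables are present and well formed. It does not verify that they agree with the translate table file.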

Makefiles and the ZFIN_WWW Source Tree

The entire web site is produced by a single makefile hierarchy. All web site development is also done within that hierarchy. The makefiles all conform to a particular look and feel. Unfortunately, they are also complicated and you need to understand a fair amount about makefiles to modify them. The makefiles are written for gmake (a.k.a. GNU Make, a variant of the standard Unix make utility). If you have questions about gmake, see the GNU Make Manual or talk to Dave C.

There is a lot of documentation in the makefiles. If you have questions about a particular file then the makefile that produces it is a good place to start.

The ZFIN web site source tree contains everything (almost) that goes into making a ZFIN web site. It is kept in CVS under the ZFIN_WWW project. Each developer gets their own copy of the source tree and makes and tests changes in their copy of the tree before those changes are checked back into CVS, tested on the preproduction site (almost.zfin.org), and then posted to production.

Each directory in the source tree also has its own makefile, and all the makefiles cooperate to form one coherent makefile hierarchy.

Generic vs. Specific

The makefiles reside in and get their source files from generic or source directories. They put their output files in specific or target directories. Files in the generic directory are, well, generic. They have had all references to specific databases, directories, Informix engines, and machines replaced with equivalent generic tags.

The specific/target directory approximately mirrors the generic/source directory, but it contains specific versions of the files where the generic tags have been replaced with references to actual databases, directories, Informix engines, and machines.

The specific/target directory tree is populated by the makefiles in the generic/source directory tree. The makefiles use a translate table file to know what specific values to replace the generic tags with. Which file to use for the translate table is determined by the $TRANSLATETABLE environment variable. Standard versions of translate table files for each ZFIN web site environment can be found at:

/private/ZfinLinks/Commons/env/VirtualHostName.tt

Generic tags have the form

  <!--|generic_tag_name|-->

For example, <!--|DB_NAME|--> is the generic tag for a database name.

The makefiles use the makespecific.pl script (which calls the makespecificworker script) to translate a file from its generic form to its specific form. Typically, makespecific.pl is called so that its output file (the specific file) is placed directly in the final output directory under the $TARGETROOT directory. This is what happens with HTML and app page files.

However, files such as C or Java source code files cannot go directly into the $TARGETROOT directory hierarchy. In these cases a staging directory is used.
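The tag substitution described in this section can be illustrated with a short Python sketch. This is not makespecific.pl and makes no claim about how that script works internally. The function name translate_generic, the use of a plain dict as the translate table, and the sample values are all assumptions made for illustration; the real translate table file format is not shown in this document.

```python
import re

# Matches generic tags of the form <!--|TAG_NAME|-->
TAG_PATTERN = re.compile(r"<!--\|([A-Za-z0-9_]+)\|-->")


def translate_generic(text, translate_table):
    """Replace each generic tag in text with its specific value.

    translate_table is a plain dict here (an assumption).
    A tag with no entry in the table raises KeyError, so a
    missing translation is noticed rather than silently kept.
    """
    def substitute(match):
        return translate_table[match.group(1)]

    return TAG_PATTERN.sub(substitute, text)


# Hypothetical translate table for the zezem environment.
table = {"DB_NAME": "zezdb"}
print(translate_generic("database=<!--|DB_NAME|-->", table))
# prints: database=zezdb
```

Failing on an unknown tag is a design choice of this sketch: a tag left untranslated in a specific file would otherwise go unnoticed until the page was served.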

Recursion

The makefiles form a recursive hierarchy. Invoking gmake in any directory invokes the makefile in that directory and the makefiles in all that directory's subdirectories. This allows you to create an entire web site directory structure by typing a single command. On the down side, it also means that if you are changing something small in a high level directory then every time you run gmake you will have to wait for it to step through all that directory's subdirectories (determining that there is nothing to do in those subdirectories, which is quick, but still takes time).
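To see why a gmake run in a high level directory takes longer, it helps to list every directory such a run steps through. The Python sketch below models the hierarchy as a dict from a directory to the subdirectories that contain makefiles. The function name, the sample tree, and the visiting order (parent first, then each subdirectory in turn) are assumptions for illustration, not a description of the actual makefiles.

```python
def directories_visited(makefile_subdirs, start):
    """Return every directory a recursive make would step through.

    makefile_subdirs maps a directory path to the list of its
    subdirectories that contain makefiles. Parent-before-children
    order is an assumption of this sketch.
    """
    visited = [start]
    for sub in makefile_subdirs.get(start, []):
        visited.extend(directories_visited(makefile_subdirs, start + "/" + sub))
    return visited


# Hypothetical fragment of a source tree.
tree = {
    "ZFIN_WWW": ["home", "lib"],
    "ZFIN_WWW/lib": ["DB_functions"],
}
for directory in directories_visited(tree, "ZFIN_WWW"):
    print(directory)
```

Running from ZFIN_WWW/lib instead of ZFIN_WWW visits only two directories in this example, which is the practical reason to run gmake as low in the tree as your change allows.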

Makefile variables are not passed from parent to child makefiles.

Targets

Targets are what makefiles create. They either correspond to real individual files, or to phony targets which may correspond to a set of files or to a particular task for the makefile to carry out. You invoke a particular target by specifying it in the gmake command. For example,

% gmake clean

There is a standard set of targets that appears in all makefiles.

Standard Makefile Targets
Target Description
all The first target in each makefile. Therefore it is also the default target. Causes all the files in this directory and its subdirectories to be made. Put another way, this creates the output of the makefile and all its child makefiles.
clean Remove intermediate, temporary, and working files in this directory and in all subdirectories. In many directories there are no local intermediate, temporary or working files to remove.
clobber Removes the target files produced by the makefiles, in this directory and in all subdirectories. It basically removes all files from target directories, but does not remove the target directories themselves. This sometimes also removes the things in your database that are put there by the makefiles themselves. This includes the app pages in the webpages table, the database functions defined in lib/DB_functions, and the contents of the EXECWEB table.
sanitycheck Performs a sanity check on files in the directory and all its subdirectories. As of 2004/08, this
  • Runs the Informix weblint program against any app pages in the current directory tree.
In the future it would be very useful if it also
  • Checked STATIC files for presence of generic tags
  • Checked STATIC & GENERIC files for presence of specific values that should be replaced with generic tags
  • Checked app page source files against what is actually loaded into the database.
onetimeonly This target currently (2001/06) doesn't do anything. However, at some point in the future it may be used to make system wide changes to files in all directories in the makefile tree. When the infrastructure was created a similarly structured target was used to do the initial translation of files from the specific form they had existed in to the generic form that is now checked into CVS.

In addition, there are several targets that occur in the top makefile and in only a few other makefiles below it. These do not propagate throughout the whole tree but only to portions of it.
Additional Makefile Targets
Target Description
mirror Used to create a web page hierarchy that is then copied by the mirror sites. This web page hierarchy contains only the static web pages. This target is present only in high level makefiles where some, but not all, of the subdirectories are going in to the mirror.
postloaddb This target should be invoked after calling loaddb.pl to load a database into an already existing development environment. loaddb.pl loads everything, or almost everything, into the DB. This includes

After a load all of these things effectively point back to the database that the data was originally unloaded from, usually production. Making the postloaddb target causes all of these things to be reloaded with definitions that are appropriate to the local DB. The all target must have been made previously.

start Start processes. Starts up any processes that need to be running for the environment to work. As of 2003/01, this target doesn't actually start or stop anything. The database engine and apache are not controlled by this. The all target must have been made previously.
stop Stop processes. Stops any processes started by the start target.

Makefile Variables

Makefile variables are distinct from environment variables. They are declared inside makefiles and do not exist outside of them. Unless extra steps are taken, makefile variables are not passed from parent to child makefiles. The set of variables defined in each makefile depends on what the makefile is doing.

There are makefile variable naming conventions, and makefiles that do similar things have the same sets of variables. In general, if a makefile has a need for one of the variables described below, then the variable is given the name listed below.

Makefile Variables
Name Description
TOP Relative path from this directory to the root of the generic directory. This is basically an indicator of how deep in the tree the current directory is. Must be defined before make.include is included. TOP is defined in every makefile.
Example:
  TOP = ../../..
SUBDIRS Subdirectories of the current directory that also contain makefiles.
TARGETDIR Identifies the directory within $TARGETROOT where the final output files produced by the makefile will go. Has the form $(TARGETROOT)/subdirectory. The subdirectory part of that usually mirrors the name of the subdirectory the makefile is in.
Example:
  TARGETDIR = $(TARGETROOT)/home
GENERICS List of generic files that the makefile will translate into specific files.
Example:
  GENERICS = classify_pubs.apg do_direct.apg
STATICS List of files that don't need to be translated from a generic form into a specific form. These files don't contain anything that needs translation.
Example:
  STATICS = fish_bgd.gif fish_net.gif
SPECIFICTARGETS List of specific versions of generic files. Depending upon the nature of the generic file, these may or may not be the final target files. For app pages and HTML files these are usually the final targets, but for Java and C files they are intermediate files. See discussion of staging directories below.
STATICTARGETS List of targets that are based on static files.
ENDEMICTARGETS_PRE
ENDEMICTARGETS_POSTTARGETDIR
ENDEMICTARGETS_POSTTARGETS
ENDEMICTARGETS_POST
These 4 variables are used with targets that require special handling in the local makefile. These variables allow those makefiles to have special processing for these targets and still use the default rules for other targets. Which of these variables a target goes into determines when that target will be made in relation to the default targets.
TARGETS List of all targets produced by the makefile. In other words, this is the list of final output files produced by the makefile.
Example:
  TARGETS = \
    $(SPECIFICTARGETS) $(STATICTARGETS)

Further documentation on all of these variables is provided in the makefile at the top of the ZFIN_WWW tree.

If a makefile uses either of the default rules include files then the makefile must use the standard variable names.

Makefile Include Files

In the top directory of the source tree there are three files that can be included in the makefiles:

Include File Purpose
make.include This file is included in every makefile immediately after the TOP variable is defined. It does several things:
  • Checks that gmake (or gnumake) is being run and not make. These makefiles make use of several features that are not in the standard make.
  • Checks that all needed environment variables are set. If they aren't then the make is aborted.
  • Defines which specific executables and options are to be used in the makefiles. For example, this defines what command and options are to be used to copy a file to its target directory, and where to find makespecific.pl.

Because this file is included in every makefile, these checks are made and the variables are defined in every makefile.

make.default.rules This defines a default set of targets, variables and rules for directories that do not contain app pages. Many directories that contain just regular HTML and/or image files can use the standard set of rules in this file. If this is included then no rules generally need to be defined in the makefile including it.
make.default.apg.rules This defines a default set of targets, variables and rules for directories that do contain app pages. This is almost identical to make.default.rules, but varies slightly in the rules.

The use of the make.default files greatly reduces the length of makefiles. About 90% of the makefiles include one of the two make.default files. This means that all the makefiles have a similar look and feel. Makefiles that don't use the default rules tend to be the more complicated ones that involve compiling code files. It also makes system wide changes to the makefiles much easier to make: We only have to modify the 2 make.default files and the 10% of the makefiles that don't use either of them.

The make.default files use the standard set of makefile variables. Any makefile that includes either of them must use the standard set of variables. This forces most makefiles to have a similar look and feel.

Staging Directories

Staging directories are needed when all of these conditions hold true:

In such cases the process for producing the final file that goes into the web site is at least a 2 step process involving at least 3 levels of files. First, the source file is converted to its specific version, with all the tags being replaced with values for the target server. Then the specific version is compiled into an object or class file which is then copied to the web site directory, either as a standalone file or as part of an archive or executable. Staging directories are used to hold the files after the first step of this process. Staging directories hold intermediate versions of files that are specific to a particular web site.

All staging directories reside under the Staging subdirectory directly under the $TARGETROOT directory. Each source directory that needs a staging directory has its own subdirectory under $TARGETROOT/Staging.
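As a small illustration of the layout just described, the Python sketch below computes a staging directory path. The document states only that each source directory gets its own subdirectory under $TARGETROOT/Staging; naming that subdirectory after the source directory's relative path is an assumption, as are the function name and the sample source directory.

```python
import posixpath


def staging_dir(target_root, source_subdir):
    """Return the staging directory for one source directory.

    Assumes (for illustration only) that the staging subdirectory
    is named after the source directory's path relative to the
    top of the source tree.
    """
    return posixpath.join(target_root, "Staging", source_subdir)


# "lib/Java" is a hypothetical source directory.
print(staging_dir("/research/zcentral/www_homes/zezem", "lib/Java"))
# prints: /research/zcentral/www_homes/zezem/Staging/lib/Java
```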

Originally, staging directories existed in the source tree. This led to a number of problems. First, it made it difficult or impossible to produce multiple destination directories from one source directory. We don't often do this, but it is occasionally a handy ability to have (e.g. when producing the beta test site). Second, having staging directories in the source tree led to lots of noise when you ran cvs update. All of the intermediate files would show up in the output and you could easily lose the important stuff.

Source Code Control and CVS

Most of what is used to produce the web site is kept under CVS, a source code/revision control system. All ZFIN source code controlled files are kept under one CVS root:

/research/zcentral/Vault/CVSroot

The source code and web pages that go into the site are stored in the ZFIN_WWW project of CVS. Projects are the largest logical grouping of files under CVS. When a developer wants to create a test web site they first check out the ZFIN_WWW project from CVS (typically into somewhere under their /research/zusers directory).

The ZFIN Commons is also under CVS control. The ZFIN Commons directories are just checked out versions of the Commons CVS project.

CVS is a big product. It has lots of commands, options, and environment variables that can be used to achieve various outcomes. Most of them you will not need to know. The CVS Manual is over 180 pages long. There is also a web site, www.cvshome.org that is dedicated to CVS. Hopefully, this document will discuss most of what you need to know to use it.

CVS Commands

Before you can use any CVS commands you must first set the CVSROOT environment variable to the directory that contains the ZFIN CVS projects:

% setenv CVSROOT /research/zcentral/Vault/CVSroot

All of the .env files set CVSROOT. You can also add the above line to your .cshrc file if you want.

All CVS commands have the form

cvs [ global_options ] command [ command_options ] [ command_args ]

All CVS commands are recursive by default, although this can be overridden for most commands.
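The ordering of the parts of a CVS command matters, because options placed before the command name are treated as global options and options placed after it as command options. The Python sketch below assembles an argument list in that order. The function name build_cvs_command is hypothetical and the sketch only builds the list; it does not run CVS.

```python
def build_cvs_command(command, global_options=(), command_options=(),
                      command_args=()):
    """Assemble: cvs [global_options] command [command_options] [command_args]."""
    return ["cvs", *global_options, command, *command_options, *command_args]


# Equivalent to typing: cvs -n -q update -d -P
print(" ".join(build_cvs_command("update", ["-n", "-q"], ["-d", "-P"])))
# prints: cvs -n -q update -d -P
```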

The table below lists the most useful commands, with their most useful options. This document does not cover all CVS commands or options. See the CVS Manual if this is not enough information for your needs.

CVS Commands
Command Purpose
cvs add file_directory_list Adds new files and/or directories to CVS. Tells CVS that the next time you commit, this file or directory should be added to the repository. Adding a file or directory does not in and of itself cause the added item to show up in the repository.
Please talk to Dave C before adding any new directories.
cvs annotate files Use the annotate command to find out in what revision each line in the file was most recently modified.
See also CVSweb.
cvs checkout [-P] cvs_project Called to create a copy of a CVS project in your local directory. This is generally called only once, when you first create your copy. After that you use the update command to update your source tree. Checkout should always be run in the directory where you want the ZFIN_WWW directory to be created. See CVS Common Command Options below for an explanation of the -P option.
cvs commit [files] Commits changes in your files to the CVS repository so that others can see them. If no files are specified then all of the changed files in this directory and all subdirectories are committed to the repository. If files are specified then only those files are committed. If others have updated any files in directories you are committing then CVS will not let you commit until you have gotten those updates from CVS using the update command.
Note that this does not in and of itself propagate the changes to the production web site. However, it does start this process.
cvs diff [files] Compares the files in your directories with the files currently stored in CVS. Useful for determining what changes you have made since the last time you updated your files.
See also CVSweb.
cvs diff -rrev1 file Compares a file in your directory with a specific revision in CVS.
See also CVSweb.
cvs diff -rrev1 -rrev2 file Compares two different revisions of a file in CVS.
See also CVSweb.
cvs edit files Tells CVS that you intend to update the given file(s). CVS responds by registering that fact and giving you a writeable copy of the file(s).
cvs editors files Lists the developers who are currently editing the given files.
cvsinfo [files] List the latest and working revision numbers, and the CVS status of files in a compact format.
Note: This is not part of the standard CVS package. It is a locally written script (that is why the command is one word instead of two). It does not operate recursively.
See also CVSweb.
cvs log [files] List the update history of files.
See also CVSweb.
cvs remove files This is the complement of the add command. It tells CVS that you want to remove the given files from CVS. The files won't actually be removed until your next commit command. (And even then CVS doesn't really remove the file, it just stops giving it to developers.) You must first remove the file from your working directory before you can remove it from CVS.
cvs status [files] This displays information about the current status of files, such as what revision you have, if others have updated it since you got it, and whether or not you have updated it. It is similar to but different from the log command. The log command displays the update history of a file, but not its current status.
See also CVSweb.
cvs unedit files You use the edit command to tell CVS to give you a writable version of a file so you can update it. You then normally use the commit command to commit those changes to the CVS repository. The unedit command is used in the cases where you decide that you want to discard the changes you have made, rather than commit. This unregisters you as an editor of the file in CVS, and restores your file to its unedited state.
% cvs unedit old_file.html
cvs [-nq] update [-dP] [files]

cvs update -D  [date-of-version-you-want]  [file-name]

cvs update -p -rrev file
Updates your working directories so that they are current with what is checked in to CVS. This basically brings you up to date with the changes that other developers have made. If you have made changes to a file in your directory, and you haven't committed those changes yet, and another developer has committed changes to that file, then CVS will automatically merge your changes with those from the other developers.

Options
-n See CVS Global Options below.
-q See CVS Global Options below.
-d The update command will not by default pick up new directories that others have added and that are not yet in your tree. In order to get the update command to also pull in any new directories that have been added by others you must specify the -d option.
-D To get a previous version of a file as a working copy. See example below.
-P See CVS Common Command Options below.
-p -rrev The -p and -r options are used to pipe the specified revision of the file to STDOUT. You can then redirect that to a file. This is the only safe way to get a copy of a previous revision of a file without setting a sticky tag for the file. And trust me, you do not want to set the sticky tag for a file.
-A (taken from CVS Manual) "Sometimes a working copy's revision has extra data associated with it, for example, it might be on a branch, or restricted to versions prior to a certain date. Because this data persists -- that is, it applies to subsequent commands in the working copy -- we refer to it as sticky. You can use the status command to see if any sticky tags or dates are set: "cvs status [filename]". Sticky tags will remain on your working files until you delete them with cvs update -A. The -A option retrieves the version of the file from the head of the trunk, and forgets any sticky tags, dates, or options. Again, avoid sticky tags if at all possible."

cvs watch add [files] If you are really concerned about changes to a particular file, you can use this command to tell CVS to send you an e-mail whenever anyone edits, unedits, or commits the given file.

CVS update -D example:

(assuming you're in your CVS folder--change the date to the appropriate revision):

% cvs update -D "2004/02/17 00:45:57" anatomy.obo

This makes *your copy* of anatomy.obo the one from 2004/02/17 00:45:57. The date is quoted because it contains a space.

% cvs update -A anatomy.obo

This clears the old version and makes your copy the last checked-in version of anatomy.obo in your folder.

This means the update statements above are over-writing your current version (again, save or check in your version if you've made changes that aren't checked in yet).

Note: this makes your file a working copy. Checking in the previous version (or a working version based on it) after reverting will result in checkins to CVS and changes in production. Be careful when doing this, and be sure to ask your colleagues questions before proceeding!

CVS Global Options

There is a set of options that can be applied to all CVS commands and that have (more or less) the same meaning in all contexts. These are called global options in CVS. They occur in the CVS command immediately after the cvs.

cvs [ global_options ] command [ command_options ] [ command_args ]

This table lists some of the more useful CVS global options and how you might use them.

Option Description
-H Display usage information about the specified CVS command
-n From the CVS Manual:
"Do not change any files. Attempt to execute the `cvs_command', but only to issue reports; do not remove, update, or merge any existing files, or create any new files. Note that CVS will not necessarily produce exactly the same output as without `-n'. In some cases the output will be the same, but in other cases CVS will skip some of the processing that would have been required to produce the exact same output."
This is useful with the update or checkout commands when you want to find out what others have changed and what you have changed without actually importing others' changes into your files. For example:
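ZFIN_WWW % cvs -nq update
U make.default.apg.rules
U make.default.rules
...
indicates that make.default.apg.rules and make.default.rules have been changed in the repository by others and would be updated in your tree. Files that you have modified but not yet committed are flagged with M instead of U.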
-q Cause the command to be somewhat quiet; informational messages, such as reports of recursion through subdirectories are suppressed.
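The output of cvs -nq update is one line per file, made of a status letter followed by the file name. The Python sketch below splits such output into (status, file) pairs. The status letters and their meanings are paraphrased from the CVS manual; the function name parse_update_output and the sample output are assumptions for illustration.

```python
# Status letters printed by "cvs update", paraphrased from the CVS manual.
STATUS_MEANINGS = {
    "U": "brought up to date from the repository",
    "P": "like U, but transferred as a patch",
    "A": "added locally, not yet committed",
    "R": "removed locally, not yet committed",
    "M": "modified in your working directory",
    "C": "conflict detected while merging",
    "?": "not under CVS control",
}


def parse_update_output(output):
    """Return a list of (status, filename) pairs from cvs update output.

    Lines that do not start with a known status letter are ignored.
    """
    result = []
    for line in output.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in STATUS_MEANINGS:
            result.append((parts[0], parts[1]))
    return result


# Hypothetical output; home/index.html is an invented file name.
sample = "U make.default.rules\nM home/index.html\n"
for status, name in parse_update_output(sample):
    print(name + ": " + STATUS_MEANINGS[status])
```

With the -n global option nothing is actually changed, so U and P lines show what an update would bring in rather than what it did.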

CVS Common Command Options

Each CVS command has its own set of command options. However, a subset of those options are common to all or most of the CVS commands. Command options follow the cvs command, but precede the command's arguments:

cvs [ global_options ] command [ command_options ] [ command_args ]

This table discusses some of the more useful common command options.

Option Description
-l Run the command in the current directory only. CVS is recursive by default. Without the -l option CVS will apply the command to the current directory and then recursively to all of its subdirectories. This may generally be the right thing to do but it can be tedious when you have changed only one local HTML file.
-P Prunes empty directories. Files can be (and are) removed from CVS cleanly. However, empty directories never really go away. Use -P to tell CVS that you don't want to see empty directories as a result of update or checkout commands.

CVSweb

CVSweb is a GUI that makes CVS repositories readable over the web. The results from many of the commands listed in the CVS Commands section can be viewed using CVSweb. The ZFIN CVSweb site is at http://cvs.zfin.org/cvs/cvsweb.cgi/. It is only accessible if you are connected to the University of Oregon network.

We don't have any documentation for CVSweb. Fortunately, it is fairly self-explanatory. Contact Dave C if you have problems with it.

CVS, Locking, and Multiple Developers

CVS uses RCS for its underlying file storage. Therefore you will occasionally see things from RCS popping up in CVS. However, CVS has many fundamental differences with RCS, and one of those fundamental differences is its approach to supporting multiple developers.

The RCS Approach

RCS requires developers to first get an exclusive lock on a file before they can update it. Only the developer that holds the lock can update a file and that developer holds the lock until their changes are checked in, or until they release the lock. This clearly prevents 2 developers from making conflicting changes to the file. Whoever gets the file last can't start making changes to the file until they get a lock on it, and when they do get a lock on it, the version of the file they get will have the earlier developer's changes already in it.

The down side of this is that it forces development to run serially, rather than in parallel. Most often developers will be making complementary rather than conflicting changes to a file. In such cases there is no need for RCS's exclusive locking model. When developers start changing a file they may not have a clear idea when that change will actually be committed (the CVS term) or checked in (the RCS term). It may be that another developer needs the file for a quick change and can't get it. Under RCS such changes involve negotiation. The developer who has the lock has to release it so the second developer can get it. The second developer then makes their change and checks it in. The first developer then checks it out again and has to figure out how to re-apply their changes to the now modified file. It's complicated.

The CVS Approach

CVS assumes that development should occur in parallel whenever possible. Its default assumption is that two developers working on the same file at the same time are likely to be making complementary changes to the code, rather than conflicting ones.

Developers do not exclusively lock files in CVS. Rather they register their intent to modify a file by using the edit command. The edit command gives the developer a copy of the file with write permissions and registers in CVS that the developer has an updateable copy of the file.

If you want to see what other developers are editing a file, you can run the editors command to list who has done an edit command on a file, but has not yet committed their changes. If you are really concerned about changes to a particular file, you can tell CVS with the watch add command to send you an e-mail whenever anybody issues an edit command on that file.

Conflict Resolution in CVS

CVS's parallel development model does allow for the possibility of conflicting changes being made by multiple developers on a file at the same time.

The update and checkout commands have the possibility of merging your changes with those of other developers. These commands can merge other developers' changes with your changes in your updated (but as yet uncommitted) copy of the file. Both commands have the potential for conflicts. The commit command will not work if it detects any conflicts.

There are two types of conflict we need to worry about:

The way CVS deals with physical conflicts depends on the command being used.

Update and Checkout:

The update command brings other developers' committed changes into the files in your directories. The checkout command also does this if used on an already existing project directory. These commands bring you up to date with what's changed in CVS since the last time you ran an update or checkout command. There are several possible cases that can occur when you update/checkout a file that you have modified locally, but haven't yet committed.

Commits:

The commit command commits changes that you have made in your copies of files into the CVS repository. It brings the CVS repository up to date with what you have been doing, and therefore makes your changes visible to others. Commits, like updates, have the potential for conflicts. Here are the cases:

Generally, before you do a commit, you should first do an update, look for any reported merges, and then examine and test the changes that were merged. Once that has been done, you should repeat the process until no merges are reported by the update. Then, and only then, should you commit your changes.
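The update-then-commit cycle described above can be partly automated. The following is a minimal sketch, not part of the ZFIN tool set: the function name check_update_output is hypothetical, and it assumes the standard cvs update output, in which a line starting with "C " marks a file with conflicts and a "Merging differences between" notice marks a merge.

```shell
#!/bin/sh
# Hypothetical helper: reads `cvs update` output on stdin and reports
# whether a commit looks safe.
# Assumptions: "C <file>" marks a conflict, and cvs prints
# "Merging differences between ..." whenever it merges a file.
check_update_output() {
    awk '
        /^C /                          { conflicts++ }
        /^Merging differences between/ { merges++ }
        END {
            if (conflicts > 0) {
                print "conflicts: resolve before committing"
                exit 2
            } else if (merges > 0) {
                print "merges: review, test, then update again"
                exit 1
            } else {
                print "clean: safe to commit"
            }
        }'
}
```

A typical use would be `cvs -q update 2>&1 | check_update_output`. The 2>&1 matters because cvs writes its merge notices to standard error. Repeat the update until the function reports a clean result, and only then commit.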

CVS and Large Projects

Part of the work we do in ZFIN comes in relatively small chunks and has a short lifespan from the start of the work to the end of the work. Such work involves only one developer and a relatively small number of files.

However, we often do projects that span more than a short period of time or that involve more than one developer. On the small end of the spectrum there are projects where one developer makes almost all of the changes, except for one file that is modified by a different developer. On the other end of the spectrum are projects that span many months, affect many files, and involve most or all of the ZFIN development staff.

The approaches discussed in previous sections will not work in these situations. Previous discussion assumed that changes were being made for single developer projects. In those cases, developers can hang onto all changes until they are ready to check them in all at once. The changes are then propagated to production shortly after that. Other developers can then pick up the changes as well. In this model, what's in CVS and what's in production are always in close agreement with each other. The only coordination issues that arise are making sure that others' independent projects haven't interfered with your changes.

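However, coordination issues can get complicated whenever one of these conditions occurs:

  1. Multiple developers are working on the same project.
  2. A single developer is working on more than one project at the same time.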

In the first case, the problem is how the developers can share their changes with each other without placing them in CVS (and therefore production) before the changes are ready to go into production. In the second case, the problem is how the developer keeps the two projects separate so that, when one set of updates is committed, all of the updates for that project are committed and none of the updates for the other project are.

We have several possible methods for doing these types of coordination. Deciding which one to use for a given project is a function of personal choice and how complicated the project is. More complicated projects require the more complicated measures.

Manual Coordination of Updates for Multiple Developer Projects

Updates can be manually coordinated for small projects where the bulk of the work is done by a single developer and other developers contribute only a few files to the process. In such cases it is easiest if only the main developer ever edits anything in CVS or commits changes to CVS. The main developer can either give others write access to the files they need to modify, or the other developers can modify copies of the files and then send them to the main developer. If the other developers are modifying their own copies of the files and then sending them to the main developer, then they should not be doing these modifications inside their CVS tree, or, if they are, then any modified files should be given different file names.

Manual Coordination of Updates for a Single Developer with Multiple Projects

Often the easiest way to deal with this situation is to assign the developer a different web site for each project. The developer then creates a separate ZFIN_WWW source tree for each project. The developer then only has to remember which source tree and web site go with which project.

This approach works well in CVS, except that the CVS edit/editors mechanism gets confused. This does not cause any hard problems but it may cause you not to be listed in the output of the cvs editors command when you should.

Automatic Coordination of Updates and CVS Branching

The manual methods described in the previous sections are useful, but won't scale well as the number of developers involved increases. This section describes how we can use CVS to deal with large projects.

Also see pages 120-130 of Open Source Development With CVS for an alternate and more complete explanation of this material.

CVS supports branching. This means that there can be a mainline version of the code and also one or more divergent versions of the code. All of our discussion so far has assumed that we were working on the mainline. If you don't do anything special in CVS, then by default you will be using and updating the mainline version of the code.

ZFIN uses CVS branching whenever we start a project that we suspect will involve multiple developers updating multiple files, or that we suspect will span a significant amount of time between start and finish.

The process for creating a new branch is to talk to Dave and:

To actually get the source files in that branch you will need to checkout a brand new copy of the ZFIN_WWW source file hierarchy. This should always be done into a different location than your mainline ZFIN_WWW directory. Note that CVS will quite happily let you checkout the branch files right on top of your mainline files. However, if you do this, you will no longer be able to update any mainline files because you no longer have access to them. If you check the branch out into a different directory then you can still access both.

The process for creating and populating your web site is almost identical to the one in the Creating a ZFIN Development Environment, An Example section.

The notable difference is that when you check out the CVS tree, you should provide the -r option and tell it which branch you want. For example:

% cvs checkout -r b2001-02-13-zmap ZFIN_WWW

Other than that, the steps laid out for the mainline will also work for branches.

This will create a ZFIN_WWW directory that contains versions of files from the given branch (b2001-02-13-zmap in the example). From this point on, any CVS commands you do in the new ZFIN_WWW tree will be on file versions from that branch (with the possible exception of cvs add commands).

This means that when you cvs edit and commit files in this tree the changes will take place only in the branch; they will not affect the mainline code. This allows developers to send updates to each other in a controlled fashion, without also putting the updates into the mainline (and thus also into production).

The version numbers of files look different for branches than they do for the mainline code. In ZFIN, the mainline always has the form 1.x, where x increases each time the file is updated after it was first checked in. Branch file version numbers typically have the form 1.x.y.z, where 1.x is the file version in the mainline that the branch was created from, y is an even number that CVS assigns to identify the branch, and z is the number of times this file has been revised in this branch. (CVS records the branch tag itself in the form 1.x.0.y, which is what appears in the symbolic names listed by cvs log.)
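As an illustration of how the two numbering forms can be told apart, here is a small sketch. The function names rev_kind and branch_point are hypothetical. The sketch relies only on the number of dot-separated fields: two for a mainline version, four for a branch version.

```shell
#!/bin/sh
# Hypothetical helpers for reading CVS version numbers.

# Print "mainline" for a two-field version (1.x), "branch" for a
# four-field version, "other" for any other depth, "invalid" otherwise.
rev_kind() {
    case "$1" in
        ""|*[!0-9.]*|.*|*.|*..*) echo "invalid"; return 1 ;;
    esac
    fields=$(printf '%s\n' "$1" | awk -F. '{ print NF }')
    case "$fields" in
        2) echo "mainline" ;;
        4) echo "branch" ;;
        *) echo "other" ;;
    esac
}

# Print the mainline version a branch version was created from,
# which is simply its first two fields.
branch_point() {
    printf '%s\n' "$1" | cut -d. -f1-2
}
```

For example, `rev_kind 1.65` reports a mainline version, while a four-field version such as 1.12.2.3 is reported as a branch version whose branch_point is 1.12.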

gmake will also operate on the branched versions of the files.

Keeping up with the Joneses

Work on the mainline doesn't stop when a branch is created. At periodic intervals during the life of the branch, you should incorporate the mainline changes into your branch. This prevents you from getting too out of date. It is generally a bad idea to wait until you are ready to merge your changes back into the mainline to do this. Doing this step early and often will minimize the amount of work you have to do when you get to that step.

To get the latest changes from the mainline into your branch, do the following.

  1. CD into your branch ZFIN_WWW directory and source the .env file for the branch site.
  2. Run cvs update with the -j HEAD option
    % cvs -q update -j HEAD

    This will cause any files that were modified in the mainline to be merged with the branch and into your working directory.

  3. Investigate any conflicts that were reported by the update and resolve them. Test the changes.

    Tips for merging:

    • If you find a conflict in your branch with a file you know you haven't changed, just bring in the latest copy from the mainline using:
      % cvs update -p -r HEAD file.html > file.html.new
      % mv file.html.new file.html

      The stdout redirect is important, because you don't want a sticky tag. More information in the CVS Commands section.

  4. Commit the files that were updated into your branch.
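Before the commit in the last step, it can help to confirm that no file still contains unresolved conflict markers. The following is a minimal sketch with a hypothetical function name, find_conflicts. It assumes the usual CVS marker lines, which begin with "<<<<<<< " and ">>>>>>> ".

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory (default: the current
# one) that still contain CVS conflict markers. The CVS administrative
# directories are skipped. Returns 1 if any such file is found, else 0.
find_conflicts() {
    dir=${1:-.}
    hits=$(find "$dir" -type d -name CVS -prune -o \
               -type f -exec grep -l -E '^(<<<<<<<|>>>>>>>) ' {} + \
               2>/dev/null || true)
    if [ -n "$hits" ]; then
        printf '%s\n' "$hits"
        return 1
    fi
    return 0
}
```

Run from the top of the branch ZFIN_WWW directory, `find_conflicts .` prints nothing when every conflict has been resolved. Any file it does list still needs hand editing before it is committed.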

Merging Branches Back Into the Mainline

Eventually, a project will be ready for prime time. All of the necessary updates will have been done, tested, and committed to the project's branch in CVS. It is now time to merge the branch back into the mainline and put it up in production.

At this point the lead developer for the project will need to merge the branch into the mainline using their mainline copy of the ZFIN_WWW hierarchy. To do this:

  1. CD into your mainline ZFIN_WWW directory and source the .env file for your mainline site.
  2. Run cvs update with the -j option to merge in the branch. For our example branch this would be:
    % cvs -q update -j b2001-02-13-zmap

    This will cause any files that were modified in that branch to be merged with the mainline.

  3. Investigate any conflicts that were reported by the merge and resolve them. Test the changes.
  4. Commit the files that were updated by the merge and then propagate them to production.

There are several possible variations on this theme. For example we could tag the mainline immediately before the merge, and/or immediately after it. Time will tell what works best.
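The tagging variation could look something like the sketch below. It only prints the commands so that they can be reviewed before anything is run. The function name merge_plan and the premerge-/postmerge- tag names are hypothetical, not an established ZFIN convention. The only requirement taken from CVS itself is that tag names start with a letter and contain only letters, digits, "-" and "_".

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the commands for a branch merge
# that is bracketed by tags on the mainline.
#   $1 = branch tag, e.g. b2001-02-13-zmap
#   $2 = a date stamp used to make the tag names unique
merge_plan() {
    branch=$1
    stamp=$2
    echo "cvs -q tag premerge-$branch-$stamp"
    echo "cvs -q update -j $branch"
    echo "# resolve conflicts and test, then:"
    echo "cvs -q commit -m 'Merge $branch into mainline'"
    echo "cvs -q tag postmerge-$branch-$stamp"
}
```

Calling `merge_plan b2001-02-13-zmap 2001-03-01` from the mainline ZFIN_WWW directory prints the five lines in order. The pre-merge tag gives a known point to return to if the merge has to be backed out.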

Creating a ZFIN Development Environment, An Example

This section walks through an example of setting up a ZFIN development environment. This explanation assumes that this is the first time the environment has been set up. Once you set up your environment, you will only occasionally want to recreate it from scratch.

This example sets up albino for the albino.zfin.org URL. The steps here are the same as you would take to set up your own environment.

A subset of this material is described with graphics in the Creating a Test Web Site section.

The process:

  1. Set up your environment variables

    Each different environment has its .env file in the ZFIN Commons env directory. Source the appropriate file for your environment:

    embryonix [~]% source /private/ZfinLinks/Commons/env/albino.env
    albino ~%

    Or, since I have an alias defined to do just that command, I could have typed:

    embryonix [~]% srcalbino
    albino ~%

    The .env files set your CVSROOT, Informix, and Makefile environment variables to the values that are appropriate for the environment. They also modify your prompt to reflect which ZFIN development environment you are using.

  2. Load data into your database

    I'm going to load the data from production that was unloaded on 2000/12/19 into albino's database, clemdb.

    albino ~% loaddb.pl -ee clemdb /research/zunloads/databases/zfindb/2000.12.19.1
    Fri Dec 22 13:30:26 PST 2000 Dropping old database (if it exists)...
    Fri Dec 22 13:30:42 PST 2000 Defining new database...
    Fri Dec 22 13:30:53 PST 2000 Creating list of tables to load...
    Fri Dec 22 13:30:53 PST 2000 Creating preload and postload scripts...
    Fri Dec 22 13:31:22 PST 2000 Disabling indexes, constraints, and triggers...
    Fri Dec 22 13:31:23 PST 2000 Loading data into database...
    Fri Dec 22 13:34:01 PST 2000 Enabling indexes, constraints, and triggers...
    Fri Dec 22 13:35:42 PST 2000 Enabling logging...
    Archive to tape device '/dev/null' is complete.
    Program over.
    WARNING!!!! You MUST now cd to your ZFIN_WWW directory and type
    WARNING!!!!
    WARNING!!!! % gmake postloaddb
    WARNING!!!!
    WARNING!!!! Failure to do this results in very unpleasant
    WARNING!!!! behavior in your web site, and the ire of all
    WARNING!!!! your coworkers.
    albino ~%

    Note the warning at the end. We will take care of this in the Make the postloaddb target step below.

  3. Get to the directory you are going to put your source tree in.

    For your own environments you'll want to cd to /research/zusers/your_login.

    albino ~% cd /research/zusers/clements/Projects/Zfin


  4. Checkout the ZFIN_WWW project into your directory.
    albino Zfin % cvs checkout -P ZFIN_WWW
    cvs checkout: Updating ZFIN_WWW
    U ZFIN_WWW/Makefile
    U ZFIN_WWW/make.default.apg.rules
    U ZFIN_WWW/make.default.rules
    U ZFIN_WWW/make.include
    cvs checkout: Updating ZFIN_WWW/cgi-bin
    U ZFIN_WWW/cgi-bin/Makefile
    U ZFIN_WWW/cgi-bin/SuperWebdriver
    About 1500 lines deleted here.
    U ZFIN_WWW/server_apps/sysexecs/Makefile
    cvs checkout: Updating ZFIN_WWW/server_apps/sysexecs/encryptpass
    U ZFIN_WWW/server_apps/sysexecs/encryptpass/Makefile
    U ZFIN_WWW/server_apps/sysexecs/encryptpass/encode.c
    U ZFIN_WWW/server_apps/sysexecs/encryptpass/encryptpass
    cvs checkout: Updating ZFIN_WWW/server_apps/sysexecs/make_thumbnail
    U ZFIN_WWW/server_apps/sysexecs/make_thumbnail/Makefile
    U ZFIN_WWW/server_apps/sysexecs/make_thumbnail/make_thumbnail.sh
    albino Zfin %

    You now have the latest and greatest version of all the files in CVS that support the ZFIN web site.

  5. Make your web site.

    This step will create the files that are in your web site, and load your app pages and database functions.

    albino Zfin % cd ZFIN_WWW/
    albino ZFIN_WWW% gmake
    gmake -C home
    gmake[1]: Entering directory `/research/zusers/clements/Projects/ZFIN_WWW/home'
    mkdir -m 755 /research/zcentral/www_homes/albino/home;
    /private/ZfinLinks/Commons/bin/makespecific.pl index.html \
        /private/ZfinLinks/Commons/env/albino.tt \
        /research/zcentral/www_homes/albino/home/index.html
    /private/ZfinLinks/Commons/bin/makespecific.pl robots.txt \
        /private/ZfinLinks/Commons/env/albino.tt \
        /research/zcentral/www_homes/albino/home/robots.txt
    cp -p fish_bgd.gif /research/zcentral/www_homes/albino/home/fish_bgd.gif
    Lots and lots of makefile status output deleted here.
    /bin/cp -fp favicon.ico \
        /research/zcentral/www_homes/albino/cgi-bin_albino/favicon.ico
    /bin/touch \
        /research/zcentral/www_homes/albino/cgi-bin_albino/favicon.ico
    gmake[1]: Leaving directory \
        `/nfs/research/zusers/clements/Zfin/ZFIN_WWW/cgi-bin'
    albino ZFIN_WWW%


  6. Make the postloaddb target

    This step is required after each time loaddb.pl is run. It is usually run immediately after loaddb.pl, but that is not possible when you are first setting up your environment (as this example is doing). The postloaddb target removes everything from the database that points back to the database that the data was originally unloaded from, and replaces it with data (web pages, SQL functions, etc) that points to your development web site instead.

    albino ZFIN_WWW% gmake postloaddb
    gmake -C home postloaddb; gmake -C server_apps postloaddb; \
        gmake -C lib postloaddb;
    gmake[1]: Entering directory \
        `/research/zusers/clements/Projects/ZFIN_WWW/home'
    gmake -C ZFIN postloaddb
    gmake[2]: Entering directory \
        `/research/zusers/clements/Projects/ZFIN_WWW/home/ZFIN'
    gmake -C APP_PAGES postloaddb
    Lots of makefile output deleted here.
    gmake[1]: Leaving directory \
        `/research/zusers/clements/Projects/ZFIN_WWW/lib'
    albino ZFIN_WWW%

  7. Test the web site

    Everything that is needed for a test web site has now been done. To test it, run netscape and go to your test site's home page. In this example this is:

    It should come up. Click on several static pages and on the database link. They should all come up fine. If not then talk to Dave C.

Updating a File, An Example

This section shows an example of how you might update a page. A subset of this material is described, with graphics, in the Updating a File section.

We noticed that in the process of converting the web site to the new infrastructure 2 items on the ZFIN home page were changed in ways we didn't want them to be. In particular: