Package Forge - The Incoming Queue Processor

Once a user has submitted a job into the incoming directory it is picked up by the incoming queue processor. The submitted job goes through a number of stages of processing and validation before tasks are registered for each target platform and architecture required.

Typically the incoming queue processor runs as a single daemon, on a server usually referred to as the PkgForge master. The daemon will scan the incoming job directory and form a queue from all the jobs found within. It will process each of these jobs in turn; the processing sequence is based on the order in which the readdir function returns them (usually alphabetical, depending on your locale). Once all discovered jobs have been processed it will wait for a certain amount of time (controlled by the poll option) before doing another scan.
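The scan-and-poll loop can be sketched as follows. This is an illustrative Python sketch, not the real Perl daemon; the function names and the cycles parameter exist only for this example.

```python
import os
import time

def scan_incoming(incoming_dir):
    """Form the job queue from the incoming directory.

    Only directories are considered to be jobs; anything else is
    discarded at this point.
    """
    return [name for name in os.listdir(incoming_dir)
            if os.path.isdir(os.path.join(incoming_dir, name))]

def run_daemon(incoming_dir, process_job, poll=60, cycles=None):
    """Process every discovered job, then sleep `poll` seconds and rescan.

    `cycles` limits the number of scans (for testing); the real daemon
    loops forever.
    """
    done = 0
    while cycles is None or done < cycles:
        for job in scan_incoming(incoming_dir):
            process_job(job)
        done += 1
        if cycles is None or done < cycles:
            time.sleep(poll)
```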

The Processing Stages

The processing of a submitted job can be broken down into a number of basic stages:

  1. Discovery
  2. Loading
  3. Validation
  4. Transfer
  5. Registration
  6. Clean-Up

Stage 1: Discovery

In the first stage any directory which is discovered within the incoming job directory is considered to contain a new job. Anything which is not a directory will have been discarded when the queue of jobs was formed.

Stage 2: Loading

An attempt will be made to load each potential job directory into a PkgForge::Job object. If the loading fails this will not initially be considered a complete failure. Instead a soft failure will have occurred and the job will be allowed to remain in the incoming directory, in an unloadable form, for up to 5 minutes (controlled by the wait_for_job option). This waiting is done because new jobs are typically submitted over a network filesystem and it will take a finite amount of time for all the necessary files to become available.
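The soft-failure window can be sketched like this. It is a simplified illustration: the real daemon tracks jobs across queue runs, whereas this helper decides from the directory's age in one call.

```python
import os
import time

def loading_outcome(job_dir, load_job, wait_for_job=300):
    """Decide how to treat a job directory that may not be fully written yet.

    Returns "loaded" on success, "soft-fail" if loading failed but the
    directory is younger than wait_for_job seconds (leave it for a later
    queue run), or "hard-fail" once that limit is exceeded.
    """
    try:
        load_job(job_dir)  # stands in for building a PkgForge::Job object
        return "loaded"
    except Exception:
        age = time.time() - os.path.getmtime(job_dir)
        return "soft-fail" if age < wait_for_job else "hard-fail"
```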

Once a job has been successfully loaded into a PkgForge::Job object the next stage is to check that the identifier string has not been previously used in the job registry. If it has not been seen previously then the new job will be added into the registry database. An entry is added to the job table for the new identifier, along with copies of a subset of the options specified in the job metadata. Only the information necessary to schedule the job is added (e.g. submission time, submitter name, job size); it is not intended to be a complete copy of the job metadata.
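As a rough sketch, the identifier check and insert might look like the following. This uses SQLite and an invented schema purely for illustration; the real registry schema is not shown here.

```python
import sqlite3

def register_job(conn, ident, submitter, size, subtime):
    """Add a new job to the registry; the identifier must be unused.

    Only the scheduling-relevant subset of the metadata is stored.
    Returns True on success, False if the identifier has been seen
    before (a hard failure for the submitted job).
    """
    try:
        conn.execute(
            "INSERT INTO job (ident, submitter, size, subtime, status) "
            "VALUES (?, ?, ?, ?, 'incoming')",
            (ident, submitter, size, subtime))
        conn.commit()
        return True
    except sqlite3.IntegrityError:  # duplicate identifier
        return False
```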

Once a new job has been successfully added into the registry it is in the incoming state.

If the job identifier had been previously seen or it was not possible to add the new job to the registry then an immediate hard failure will occur and there will be no further attempts to process the job. In that situation the job will then move immediately to the Clean-Up stage.

Stage 3: Validation

Once a new job has been loaded it is necessary to validate the associated payload. This is done by calling the validate method on the PkgForge::Job object, which applies the following checks:

  1. The job must contain at least one source package.
  2. The SHA1 sum of each source package file must be correct.
  3. The source package must be valid.

Presently the checking of the SHA1 sum for each source package is purely to ensure that the file is the same as that submitted by the user. This ensures that it is not still in transit and has not been corrupted. The system has been designed to make it possible to add support for the user to digitally sign the job manifest. Currently it would be possible to alter both a source package and the associated manifest after the user has completed their submission. A digitally-signed manifest could be used to guarantee that no tampering has occurred.
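The SHA1 check for each source package can be sketched as follows. The manifest representation (a mapping from file name to hex digest) is an assumption made for this illustration.

```python
import hashlib
import os

def sha1_of(path, bufsize=65536):
    """Compute the SHA1 digest of a file without loading it all into memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(bufsize), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(job_dir, manifest):
    """Check every source file against its recorded SHA1 sum.

    A missing file or a digest mismatch means the file is still in
    transit, or has been corrupted or tampered with.
    """
    for name, expected in manifest.items():
        path = os.path.join(job_dir, name)
        if not os.path.isfile(path) or sha1_of(path) != expected:
            return False
    return True
```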

Each source package is represented by a Perl class which implements the PkgForge::Source Moose role. The role requires that the class must implement a validate method which returns true or false to indicate whether or not the source package is valid. Currently only the SRPM file type is supported. For that class the following validity checks are done:

  1. The file name must have a .src.rpm suffix.
  2. The file must exist.
  3. The file must be a valid Source RPM.
  4. The SRPM must contain a file which has the .spec suffix.
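For illustration, the four checks might be sketched as below. The cheap checks (suffix and existence) are straightforward; actually parsing the RPM header and listing its contents requires a real RPM library, so that part is represented here by a caller-supplied stub.

```python
import os

def validate_srpm(path, list_contents):
    """Sketch of the SRPM validity checks.

    `list_contents` stands in for an RPM library call that returns the
    file names inside the package, or raises if the file is not a valid
    source RPM.
    """
    if not path.endswith(".src.rpm"):
        return False                     # check 1: file name suffix
    if not os.path.isfile(path):
        return False                     # check 2: file must exist
    try:
        contents = list_contents(path)   # check 3: must be a valid SRPM
    except Exception:
        return False
    return any(name.endswith(".spec") for name in contents)  # check 4
```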

If a new job has passed all the validation checks then it will be marked as valid in the registry database. If the new job has failed any checks then, as with the loading stage, it will initially be considered a soft failure and the job will be left in the incoming queue for up to the time specified in the wait_for_job option. With each queue run it will be reconsidered to see if it has become valid. This allows time for the complete submission of files over a slow network when a network filesystem is being used. Once that time limit is exceeded a hard failure will have occurred and the job will be marked in the registry database as invalid. The job will then move immediately to the Clean-Up stage.

Stage 4: Transfer

Once the job has been validated it is copied to the directory where accepted jobs are stored. Throughout the copying process the new PkgForge::Job object for the recently validated job is kept in memory. Once the copying is complete this object is used to, once more, check the SHA1 sums of the copied source files to ensure that no tampering or corruption has occurred. This object is also used to write-out a new job metadata file into the accepted job directory.
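The transfer sequence (copy, re-verify, write metadata) can be sketched like this. The metadata file name and JSON format are assumptions for the example; the real system writes its own job metadata format.

```python
import hashlib
import json
import os
import shutil

def sha1_of(path):
    """SHA1 digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha1(fh.read()).hexdigest()

def transfer_job(job_dir, accepted_dir, ident, manifest, metadata):
    """Copy a validated job into the accepted area and re-verify it.

    `manifest` maps file name to the SHA1 sum held in memory from the
    validation stage; any mismatch after copying aborts the transfer.
    """
    dest = os.path.join(accepted_dir, ident)
    os.makedirs(dest)
    for name, expected in manifest.items():
        target = shutil.copy2(os.path.join(job_dir, name), dest)
        if sha1_of(target) != expected:
            shutil.rmtree(dest)
            return False   # corruption or tampering during the copy
    # write out a fresh metadata file from the in-memory job object
    with open(os.path.join(dest, "metadata.json"), "w") as fh:
        json.dump(metadata, fh)
    return True
```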

The intention is that, for security, only the user which the incoming queue processor runs as is permitted to write into the accepted job directory. If a standard, local Unix filesystem is being used and anything else runs as the same user then this probably does not give much extra confidence. However, if something like AFS is being used then there can be a high level of trust in the data integrity of the accepted job directory if the write/insert access is highly restricted.

If, for any reason, the transfer fails, then the processing of this job will be considered to have failed. It will then be marked in the registry database as being in the failed state and it will be moved immediately to the Clean-Up stage.

Stage 5: Registration

Once a job has been validated and accepted then tasks can be registered for each platform. A distinction in terminology is made here to avoid confusion: a task is purely a job which has been registered for a specific platform/architecture combination. It should be noted that a job can be considered completely valid but not result in the registration of any tasks; in that case nothing will actually be done with the submitted job and its source packages.

As part of the processing instructions specified by the user, each submitted job has a list of applicable platforms and a list of applicable architectures. Typically these might actually be just the special "all" string in both cases which, as expected, signifies that tasks should be registered for all platforms and/or architectures. There is plenty of scope for a user to specify and restrict the sets of platforms and architectures for which tasks should be registered; this is fully described in the job documentation. The sets of target platforms and architectures are computed by examining the list of available, active platforms in the registry database and applying the filters specified by the user for the new job.
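Applying the user's filters to the set of active platforms can be sketched as follows. The platform and architecture names in the usage are invented for the example.

```python
def target_tasks(active_platforms, want_platforms, want_archs):
    """Compute the (platform, architecture) pairs to register as tasks.

    `active_platforms` is a list of (platform, arch) pairs taken from
    the registry; the want_* arguments are the user's filters, where
    the special string "all" matches everything.
    """
    def wanted(value, wants):
        return wants == "all" or value in wants

    return [(plat, arch) for plat, arch in active_platforms
            if wanted(plat, want_platforms) and wanted(arch, want_archs)]
```

Note that a perfectly valid job can yield an empty task list if its filters match no active platform.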

Note that having a task registered for an active platform does not guarantee that a build daemon is currently available for that platform. It is perfectly acceptable to queue tasks for a platform which currently has no build daemons registered. It may also, of course, be the case that a build daemon is busy or currently unavailable due to maintenance. It is also worth noting that a platform may have multiple build daemons. It is not possible to guarantee which build daemon will accept a particular task but, as they should all have identical build environments, this should not cause problems. Full details of the task scheduling are available in the build daemon documentation.

Once tasks have been successfully registered the new job will be marked in the registry database as being in the registered state. If, for any reason, the task registration fails, then the processing of this job will be considered to have failed. It will then be marked in the registry database as being in the failed state and it will be moved immediately to the Clean-Up stage.

Stage 6: Clean-Up

The final stage, no matter which final state a submitted job has reached, is the clean-up of the incoming queue directory. The directory for the submitted job and all of its contents will be removed.
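In Python terms this final stage amounts to a recursive removal, sketched here:

```python
import shutil

def cleanup_job(job_dir):
    """Remove the submitted job directory and all of its contents.

    This runs whatever final state the job reached, so failed jobs do
    not accumulate in the incoming queue.
    """
    shutil.rmtree(job_dir, ignore_errors=True)
```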