gLite Data Transfer Agents User's Guide

Paolo Badino

$Id: README.xml,v 1.10 2007/03/12 15:03:17 badino Exp $


Table of Contents

Introduction
Config Service
Agent Deployment Types
GLite Data Transfer Channel Agent
GLite Data Transfer VO Agent
Component Configuration Details
agents-sd
agents-cred-myproxy
transfer-agent-fsm
transfer-agent-scheduler
transfer-agent-catalog-python
transfer-agent-ts-urlcopy
agents-dao-oracle
transfer-agent-dao-oracle
transfer-agent-dao-oracle
transfer-agent-channel-actions
transfer-channel-agent
transfer-agent-vo-actions
transfer-vo-agent
Example of Configuration Files
glite-transfer-channel-agent-channel_1.properties.xml
glite-transfer-vo-agent-EGEE.properties.xml (No Catalog)
glite-transfer-vo-agent-EGEE.properties.xml (With Catalog)
glite-transfer-vo-agent-EGEE.log-properties
Extending VO Actions with Python Scripts
Structure of the FTS VO Agent Configuration File that uses python strategies
agents-python
transfer-agent-vo-actions-python
Python Retry User Guide
glite-transfer-vo-agent-python-EGEE.properties.xml (FTS)
Provide a Specific CatalogService Plugin
Structure of the VO Agent Configuration File that uses python plugins for Catalog interaction and Retry Logic
agents-python
transfer-agent-catalog-python
Python Catalog User Guide
Customizing the Name Generation function
Python NameGeneration User Guide
glite-transfer-vo-agent-EGEE.properties.xml (With Python Plugins)

Abstract

This document contains the information needed to configure a GLite Data Transfer Agent. For additional information on how to install this module, please refer to the gLite Data Transfer Agents Module Install Guide.

Introduction

This package provides the agents that perform the actions concerning data transfers. There are two different kinds of agent: the Channel Agent and the VO Agent.

The Channel Agent is responsible for managing all the file transfers through a channel, i.e. the entity that represents the physical, unidirectional link between two sites. This agent fetches the file transfer requests from a Queue and submits them to the configured TransferService.

The other type of agent, the VO Agent, is in charge of performing all the actions that are specific to a Virtual Organization (VO), which involves applying VO policies and interacting with a Catalog. If a VO requires Catalog interaction, it has to configure a plugin to talk to a Catalog Service, in order to retrieve the source and destination SURLs from the logical file names (LFNs and GUIDs) and the source and destination sites. Once a transfer is completed, the new replicas are also registered in the appropriate catalog.

One Channel Agent is needed for each channel available on the site, and one VO Agent is needed for each VO that wants to perform data transfers. All of these agents share the same Queue, but the FTA framework guarantees that they interact with each other in the proper way: a VO Agent is allowed to see only the jobs (and related information) that belong to its own VO and, in the same way, a Channel Agent is not able to process requests belonging to other channels. You can imagine that each agent acts on a view of the entire Queue:


              /---------------\
              |     Queue     |
              \---------------/
              |--------       |
    VO_1      || Vo_1 |       |
    Agent =====> View |-------|
              |-------- Ch_1 ||    Channel_1
              |      |  View <=====  Agent
              |--------      ||
    VO_2      || Vo_2 |-------|
    Agent =====> View |-------|      
              |-------- Ch_2 ||    Channel_2 
              |      |  View <=====  Agent
              |--------      ||
    VO_3      || Vo_3 |-------|
    Agent =====> View |       |
              |--------       |
              \---------------/
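The view mechanism can be pictured with a minimal Python sketch. It is only an illustration: the TransferJob record and the vo_view and channel_view helpers are hypothetical names, and the real agents obtain their views through the DAO connector rather than by filtering a list in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferJob:
    """Hypothetical, simplified queue entry (not the real Queue schema)."""
    job_id: str
    vo: str
    channel: str


def vo_view(queue, vo_name):
    """Jobs visible to the VO Agent of vo_name: only the jobs of that VO."""
    return [job for job in queue if job.vo == vo_name]


def channel_view(queue, channel_name):
    """Jobs visible to the Channel Agent of channel_name."""
    return [job for job in queue if job.channel == channel_name]


# One shared queue, seen through different views.
queue = [
    TransferJob("j1", "VO_1", "Channel_1"),
    TransferJob("j2", "VO_2", "Channel_1"),
    TransferJob("j3", "VO_2", "Channel_2"),
]
vo2_jobs = [job.job_id for job in vo_view(queue, "VO_2")]
ch1_jobs = [job.job_id for job in channel_view(queue, "Channel_1")]
```

Job j2 appears in both the VO_2 view and the Channel_1 view, which is how a VO Agent and a Channel Agent can cooperate on the same request without seeing each other's unrelated jobs.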

The Channel Agent and the VO Agent work in the same way: they periodically run some actions in order to perform the steps required to transfer data.

The Channel Agent actions are:

Fetch

Submit new File transfer request to the TransferService

Check

Check the state of all the active File transfer requests and update the Queue with the retrieved information

Cancel

Revoke active file transfers marked as canceling on the Queue

The VO Agent actions are:

Allocate

Allocate a transfer job to a channel based on the source and destination of the related files. In case no SURLs are provided, it performs the name resolution starting from the Logical name and contacting a Catalog service

CheckReadiness

Check if the source files are ready to be transferred, i.e. already staged in the SE cache

Finalize

Perform the finalization of the transfer, registering, if required, the new replicas into a Catalog service

Retry

Reschedule failed transfers that are in waiting state

Cancel

Revoke pending (i.e. not yet processed by the Channel Agent) file transfers marked as canceling on the Queue.

The GLite Data Transfer Agents module provides a default action for each of these types, but it also allows the behavior of an agent to be extended by configuring different actions.

Config Service

The gLite Data Transfer Agents use the gLite Data Config Service modules for retrieving initialization and configuration parameters. For a detailed explanation of how that module works, please have a look at the gLite Data Config Service User's Guide.

Agent Deployment Types

GLite Data Transfer Channel Agent

The GLite Transfer Agent requires the configuration files named glite-transfer-channel-agent-<CHANNEL_NAME>.properties.xml and glite-transfer-channel-agent-<CHANNEL_NAME>.log-properties in $GLITE_LOCATION_USER/etc, $GLITE_LOCATION/etc or $GLITE_LOCATION_VAR/etc, where <CHANNEL_NAME> is the name of the Channel which the agent is responsible for. The structure of the configuration file depends on the type of Queue and Transfer Service the Channel Agent has to contact:


<service name="transfer-channel-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-ts-****">
      <!-- Transfer Service Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-channel-actions">
      <!-- Channel Actions Configuration -->
    </component>
    <component name="transfer-channel-agent">
      <!-- Channel Agent Configuration -->
    </component>
  </components>
</service>

where the sections marked with **** depend on the connector that has to be used.

GLite Data Transfer VO Agent

The GLite Transfer Agent requires the configuration files named glite-transfer-vo-agent-<VO_NAME>.properties.xml and glite-transfer-vo-agent-<VO_NAME>.log-properties in $GLITE_LOCATION_USER/etc, $GLITE_LOCATION/etc or $GLITE_LOCATION_VAR/etc, where <VO_NAME> is the name of the Virtual Organization which the Agent belongs to. The structure of the configuration file also depends on the chosen VO Agent deployment type (with or without catalog interaction) and on the type of the Queue the Agent has to contact. Please note that a VO can install only one VO Agent.
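As an illustration of the naming scheme used for both agent types, the following Python sketch builds the candidate configuration file paths for an agent. The function name is hypothetical, and the order in which the three directories are listed is an assumption: the text above does not state a lookup precedence.

```python
import os
import posixpath

# Environment variables whose etc/ directory may hold the files.
# The ordering is an assumption of this sketch, not documented behavior.
LOCATION_VARS = ("GLITE_LOCATION_USER", "GLITE_LOCATION", "GLITE_LOCATION_VAR")


def candidate_config_paths(agent_kind, name, environ=None):
    """List the possible paths of the two files an agent needs.

    agent_kind is "channel" or "vo"; name is the channel or VO name.
    """
    env = os.environ if environ is None else environ
    base = f"glite-transfer-{agent_kind}-agent-{name}"
    file_names = (base + ".properties.xml", base + ".log-properties")
    paths = []
    for var in LOCATION_VARS:
        root = env.get(var)
        if not root:
            continue  # variable not set: nothing to look up there
        for file_name in file_names:
            paths.append(posixpath.join(root, "etc", file_name))
    return paths


# Example with an explicit, made-up environment.
example = candidate_config_paths("vo", "EGEE", {"GLITE_LOCATION": "/opt/glite"})
```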

Structure of the VO Agent Configuration File


<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>

Structure of the VO Agent Configuration File, with Catalog plugin


<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <!-- Optional NameGeneration Plugin:
    <component name="transfer-agent-namegen-****">
    </component>
    -->
    <component name="transfer-agent-catalog-****">
      <!-- Catalog Service client Plugin -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>

where the sections marked with **** depend on the connector/plug-in that has to be used.

Currently, the following Connectors are implemented:

Transfer Service:

  • Glite-Url-Copy -- transfer-agent-ts-urlcopy

It can be configured to use third party gsiftp transfers (type is "urlcopy") or to issue SRM copy requests (type is "srmcopy")

Data Access Object:

  • Oracle -- transfer-agent-dao-oracle

The requirement is that the configuration should contain one connector per type (transfer-agent-ts-* and transfer-agent-dao-*).
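A quick way to check this requirement on a configuration file is sketched below in Python. The helper name and the sample document are hypothetical; the sample only mimics the component skeleton shown earlier and omits the real parameters.

```python
import xml.etree.ElementTree as ET

CONNECTOR_PREFIXES = ("transfer-agent-ts-", "transfer-agent-dao-")


def find_connectors(xml_text):
    """Group the configured component names by connector type prefix."""
    root = ET.fromstring(xml_text)
    names = [component.get("name", "") for component in root.iter("component")]
    return {
        prefix: [name for name in names if name.startswith(prefix)]
        for prefix in CONNECTOR_PREFIXES
    }


# Hypothetical, stripped-down Channel Agent configuration.
SAMPLE = """
<service name="transfer-channel-agent">
  <components>
    <component name="agents-dao-oracle"/>
    <component name="transfer-agent-dao-oracle"/>
    <component name="transfer-agent-ts-urlcopy"/>
  </components>
</service>
"""

found = find_connectors(SAMPLE)
# Exactly one connector of each type is expected in this sample.
one_per_type = all(len(names) == 1 for names in found.values())
```

Note that agents-dao-oracle is the base DAO library, not a connector, so it does not match the transfer-agent-dao- prefix.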

As for the Catalog Client, currently we provide:

  • GLite Fireman -- transfer-agent-catalog-fireman (provided by the module org.glite.data.transfer-agent-fireman)

  • PythonCatalog -- transfer-agent-catalog-python

Note

Please note that a Catalog Client is required only if the VO Agent wants to accept transfer job requests with logical names. Please also note that the PythonCatalog plugin doesn't provide the functionality to interact with a CatalogService by itself: it is intended to act as a "bridge" for developing VO-specific CatalogService plugins as Python scripts.

Finally, if Catalog interaction is required, it is possible to specify the strategy used to generate the name of the destination file starting from the logical name. This plugin is optional: if no component is provided, the FTA uses a default implementation that concatenates the SRM endpoint, the Storage Element path and the logical name. The available plugins are:

  • *Default* (used if no plugin is provided)

  • PythonNameGen -- transfer-agent-namegen-python
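The default strategy can be approximated by the following Python sketch. The function name and the example values are hypothetical, and the way slashes are handled between the three parts is an assumption, since the text only states that the parts are concatenated.

```python
def default_destination_name(srm_endpoint, se_path, logical_name):
    """Concatenate SRM endpoint, Storage Element path and logical name.

    Duplicate slashes at the joins are collapsed (assumption of this sketch).
    """
    parts = [
        srm_endpoint.rstrip("/"),
        se_path.strip("/"),
        logical_name.lstrip("/"),
    ]
    return "/".join(part for part in parts if part)


# Made-up endpoint, path and logical file name.
destination = default_destination_name(
    "srm://se.example.org:8443",
    "/dpm/example.org/home/egee",
    "/grid/egee/file1",
)
```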

Component Configuration Details

The following section describes the configuration parameters for the GLite Transfer Agents. The configuration for the Connectors/Plug-ins is reported in a dedicated section. Since most of the modules are common to the Channel and VO Agents, deployment model specifics are discussed in the description notes. For the structure of the configuration file for a given agent, please check the previous section.

agents-sd

Represents the module that should be used to retrieve service information from ServiceDiscovery

Library:

libglite_data_agents_common_sd.so

Initialization:
SRM_ServiceType [expert]

The type used to register SRM Services on ServiceDiscovery

Type: string; Default value: 'SRM'

GridFTP_ServiceType [expert]

The type used to register GRIDFTP Servers on ServiceDiscovery

Type: string; Default value: 'gsiftp'

MyProxy_ServiceType [expert]

The type used to register MyProxy Catalog Services on ServiceDiscovery

Type: string; Default value: 'MyProxy'

FileTransfer_ServiceType [expert]

The type used to register FileTransfer Services on ServiceDiscovery

Type: string; Default value: 'org.glite.FileTransfer'

FileTransferStats_ServiceType [expert]

The type used to register FileTransferStat Services on ServiceDiscovery

Type: string; Default value: 'org.glite.FileTransferStats'

ChannelManagement_ServiceType [expert]

The type used to register ChannelManagement Services on ServiceDiscovery

Type: string; Default value: 'org.glite.ChannelManagement'

ChannelAgent_ServiceType [expert]

The type used to register Channel Agents on ServiceDiscovery

Type: string; Default value: 'org.glite.ChannelAgent'

BDII_ServiceType [expert]

The type used to register BDII Services on ServiceDiscovery

Type: string; Default value: 'BDII'

VOMS_ServiceType [expert]

The type used to register VOMS Services on ServiceDiscovery

Type: string; Default value: 'org.glite.voms'

SEIndex_ServiceType [expert]

The type used to register SEIndex Services on ServiceDiscovery

Type: string; Default value: 'org.glite.SEIndex'

Fireman_ServiceType [expert]

The type used to register Fireman Services on ServiceDiscovery

Type: string; Default value: 'org.glite.Fireman'

SEPath_Property [expert]

The name of the SRM Service Property that provides the path to be used. This is computed starting from the Glue SAPath property

Type: string; Default value: 'SEMountPoint'

ChannelAgentSource_Property [expert]

The name of the ChannelAgent Service Property that provides the channel source site.

Type: string; Default value: 'source'

ChannelAgentDestination_Property [expert]

The name of the ChannelAgent Service Property that provides the channel destination site.

Type: string; Default value: 'destination'

MyProxyFtsMode_Property [expert]

The name of the MyProxy Server Property that provides the Fts Mode.

Type: string; Default value: 'FtsMode'

MyProxyFtsMode_Retriever [expert]

The value of the MyProxy Server FtsMode Property for retrievers.

Type: string; Default value: 'retriever'

MyProxyFtsMode_Renewer [expert]

The value of the MyProxy Server FtsMode Property for renewers.

Type: string; Default value: 'renewer'

EnableCache [advanced]

Enable caching of ServiceDiscovery information

Type: boolean; Default value: 'true'

Configuration:
Cache_TTL [expert]

The validity, in seconds, of the entries created in the cache. When this time has elapsed, the entry is refreshed automatically by the cache the next time the user asks for it. Entries related to queries that returned no result are not refreshed. If this parameter is not provided, the internal default will apply (1 hour)

Type: integer; Default value: ''

Cache_StaleTime [expert]

The reduced validity time for the stale entries in the cache. When the validity of an entry expires, the cache tries to refresh it using ServiceDiscovery. In case of failure, the entry is considered stale and kept in the cache with a shorter validity. If this parameter is not provided, the internal default will apply (15 minutes)

Type: integer; Default value: ''

Cache_ObsoleteTime [expert]

The number of seconds that should elapse before the entries in the cache are considered obsolete and can then be purged. Since entries are periodically refreshed based on the usage, the purge operation affects only stale entries. If this parameter is not provided, the internal default will apply (5 hours)

Type: integer; Default value: ''

Cache_NegativeObsoleteTime [expert]

The number of seconds that should elapse before the entries related to queries that didn't return any result are considered obsolete. These "negative" results are not automatically refreshed and therefore should be cleaned more often. If this parameter is not provided, the internal default will apply (5 minutes)

Type: integer; Default value: ''
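The interplay of the four cache timings can be summarized with a small Python sketch. The class and function names are hypothetical, and measuring the age of an entry from its last update is an assumption: the parameter descriptions above do not say which timestamp the cache uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheTimes:
    """Cache timings in seconds, set to the documented internal defaults."""
    ttl: int = 3600                    # Cache_TTL: 1 hour
    stale_time: int = 900              # Cache_StaleTime: 15 minutes
    obsolete_time: int = 18000         # Cache_ObsoleteTime: 5 hours
    negative_obsolete_time: int = 300  # Cache_NegativeObsoleteTime: 5 minutes


def validity_after_refresh(refresh_ok, times):
    """Validity granted to an entry after a refresh attempt.

    A failed refresh leaves a stale entry with the shorter validity.
    """
    return times.ttl if refresh_ok else times.stale_time


def can_purge(seconds_since_last_update, times, negative=False):
    """True when an entry is old enough to be purged as obsolete.

    Negative entries (queries without results) use their own, shorter limit.
    """
    limit = times.negative_obsolete_time if negative else times.obsolete_time
    return seconds_since_last_update >= limit


defaults = CacheTimes()
```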

agents-cred-myproxy

Represents the module that provides a client to the MyProxy Server for retrieving proxy certificates to be used for the File Transfers

Library:

libglite_data_agents_common_cred_myproxy.so

Initialization:
Server [advanced]

The host name of the MyProxy Server. If this parameter is not set or is empty, the MyProxy Server is looked up using the Service Discovery and then, if not found, the myproxy default will apply (MYPROXY_SERVER environment variable)

Type: string; Default value: ''

Port [advanced]

The port of the MyProxy Server. If this parameter is not set or is 0, the myproxy default will apply

Type: integer; Default value: '0'

ProxyLifetime [advanced]

The lifetime in seconds of the proxy certificates that will be created

Type: integer; Default value: '86400'

Repository [advanced]

The location where the certificates retrieved from the MyProxy Service will be stored. That location must already exist

Type: string; Default value: '/tmp'

MinValidityTime [advanced]

The minimum validity time (in seconds) an existing certificate should have before a new job is submitted. If the certificate doesn't satisfy this requirement, a new certificate will be retrieved from the MyProxy Service

Type: integer; Default value: '3600'

Configuration:

<No configuration parameters>

transfer-agent-fsm

Represents the library that provides the logic for the File and Job State transitions

Library:

libglite_data_transfer_agent_fsm.so

Initialization:
EnableHold [expert]

When this parameter is set to true, in case of consecutive failures, a transfer will be moved to the "Hold" state, waiting for manual intervention, otherwise it will go in "Failed"

Type: boolean; Default value: 'true'

Configuration:

<No configuration parameters>
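How EnableHold combines with the MaxFailures parameter of the transfer-agent-vo-actions component can be sketched in Python as follows. The function name is hypothetical, and both the exact comparison (failures reaching MaxFailures) and the use of the Waiting state for transfers that will be retried are assumptions of this sketch.

```python
def state_after_failure(failures, max_failures=3, enable_hold=True):
    """Return the state of a file after its latest failure.

    failures is the number of failures recorded so far, including this one.
    """
    if failures >= max_failures:
        # No more automatic retries: manual intervention or final failure.
        return "Hold" if enable_hold else "Failed"
    return "Waiting"  # will be rescheduled by the Retry action


states_with_hold = [state_after_failure(n) for n in (1, 2, 3)]
states_without_hold = [state_after_failure(n, enable_hold=False) for n in (1, 2, 3)]
```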

transfer-agent-scheduler

This module provides a scheduler class that is able to periodically execute FTA Actions

Library:

libglite_data_transfer_agent_scheduler.so

Initialization:

<No initialization parameters>

Configuration:
MaxFailures [expert]

The number of consecutive failures before an Action is considered disabled for <DisableTime> seconds. If that value is set to zero, actions will never be disabled and the parameter <DisableTime> is ignored

Type: integer; Default value: '0'

DisableTime [expert]

The number of seconds an action stays disabled

Type: integer; Default value: '300'

CheckInterval [expert]

Specifies the time interval, in seconds, at which to periodically check if the DAO context is valid. If 0 is specified, the context is checked on every iteration

Type: integer; Default value: '60'

StopTimeout [expert]

Specifies the timeout, in seconds, that a scheduler has to wait for its thread to stop gracefully. If the timeout elapses, a signal is sent in order to try to abort the running action

Type: integer; Default value: '100'
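The relationship between MaxFailures and DisableTime can be illustrated with a short Python sketch. The ActionGuard class is hypothetical, and resetting the failure counter when an action gets disabled is an assumption; the real scheduler may keep its bookkeeping differently.

```python
class ActionGuard:
    """Tracks consecutive failures of one action and its disabled period."""

    def __init__(self, max_failures=0, disable_time=300):
        self.max_failures = max_failures
        self.disable_time = disable_time
        self.consecutive_failures = 0
        self.disabled_until = None

    def record(self, succeeded, now):
        """Register the outcome of one execution at time now (seconds)."""
        if succeeded:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        # MaxFailures == 0 means the action is never disabled.
        if self.max_failures > 0 and self.consecutive_failures >= self.max_failures:
            self.disabled_until = now + self.disable_time
            self.consecutive_failures = 0

    def is_enabled(self, now):
        """True when the action may be executed at time now."""
        return self.disabled_until is None or now >= self.disabled_until


guard = ActionGuard(max_failures=2, disable_time=300)
guard.record(False, now=0)
enabled_after_one = guard.is_enabled(1)
guard.record(False, now=1)
enabled_after_two = guard.is_enabled(2)
enabled_later = guard.is_enabled(301)
```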

Catalog Client: Client for retrieving Replica Information from a Catalog Service. If no catalog interaction is foreseen, this component can be removed from the configuration file. This module will be loaded at run-time. Currently, the supported plugins are FiremanCatalog and the python catalog bridge

For the Fireman plugin, please have a look at the org.glite.data.transfer-agent-fireman module.

The python catalog bridge details are:

transfer-agent-catalog-python

Represents the module that should be used to call a Python script in order to resolve logical file names and register new replicas

Library:

libglite_data_transfer_agent_catalog_python.so

Initialization:
CatalogModule [/mandatory]

The name of the python module that contains the functions used to check replication permissions, resolve logical file names and register new replicas

Type: string; Default value: ''

CatalogParams [advanced]

This parameter represents an initialization string that will be passed to the python module that contains the catalog plugin, specified by the "CatalogModule" parameter. The format of that string is module dependent

Type: string; Default value: ''

Configuration:

<No configuration parameters>

File Transfer Service Connector: the configuration file for the GLite Transfer Agent should specify one connector for contacting the Transfer Service. This module will be loaded at run-time.

transfer-agent-ts-urlcopy

Represents the connector to use the URLCopy as Transfer Service

Library:

libglite_data_transfer_agent_ts_urlcopy.so

Initialization:
TransferType []

The type of transfer that should be performed. This value can be either "urlcopy" or "srmcopy": In the first case, the TransferService would perform the SRM interaction and start a GridFTP transfer, in the latter the TransferService would delegate the transfer to the SRM by an SRMCopy call

Type: string; Default value: 'urlcopy'

MaxTransfers []

The maximum number of transfer requests that the Transfer Service can process simultaneously

Type: integer; Default value: '50'

MaxBulkSize [advanced]

The maximum size for a bulk SrmCopy request. This value is ignored if the "TransferType" parameter is different from "srmcopy"

Type: integer; Default value: '100'

SRMGridFTPSplit [advanced]

Controls the SRM/GridFTP split. If it is true, then the GridFTP session starts only when the SRM operations have finished. If it is false, then the gridFTP session starts with the SRM session (even in the preparation phase).

Type: boolean; Default value: 'false'

SyslogIdent [advanced]

The identification string (log category) for the syslog messages sent by the transfer jobs.

Type: string; Default value: 'FTS'

SyslogFacility [advanced]

The syslog facility for the syslog messages sent by the transfer jobs. Possible values are: "LOG_DAEMON", "LOG_FTP", "LOG_USER".

Type: string; Default value: 'LOG_DAEMON'

SyslogDisable [advanced]

If it is true, the FTS logging to syslog is disabled.

Type: boolean; Default value: 'false'

Configuration:

Data Access Object Connector: the configuration file for the GLite Transfer Agent should specify one connector for contacting the Queue. This module will be loaded at run-time.

agents-dao-oracle

Represents the base module to Access Data (DAO) using Oracle as Data Source

Library:

libglite_data_agents_common_dao_oracle.so

Initialization:
ConnectString [/mandatory]

The Oracle ConnectString identifying the DB

Type: string; Default value: 'localhost'

User [/mandatory]

the name of the user that should be used to connect to the DB

Type: string; Default value: ''

Password [/mandatory]

the password of the user that should be used to connect to the DB

Type: string; Default value: ''

StatementCacheSize [expert]

The size of the statement cache. 0 means that caching is disabled. Note: since some memory leaks have been observed, it is better for the moment to keep the cache disabled

Type: integer; Default value: '0'

Configuration:

<No configuration parameters>

transfer-agent-dao-oracle

Represents the module to Access Data (DAO) using Oracle as Data Source

Library:

libglite_data_transfer_agent_dao_oracle.so

Initialization:
VOView [expert]

Enable the VO View DAO Factory

Type: string; Default value: 'false'

ChannelView [expert]

Enable the Channel View DAO Factory

Type: string; Default value: 'true'

CredentialsView [expert]

Enable the Credentials View DAO Factory

Type: string; Default value: 'true'

Configuration:

<No configuration parameters>

transfer-agent-dao-oracle

Represents the module to Access Data (DAO) using Oracle as Data Source

Library:

libglite_data_transfer_agent_dao_oracle.so

Initialization:
VOView [expert]

Enable the VO View DAO Factory

Type: string; Default value: 'true'

ChannelView [expert]

Enable the Channel View DAO Factory

Type: string; Default value: 'false'

CredentialsView [expert]

Enable the Credentials View DAO Factory

Type: string; Default value: 'false'

Configuration:

<No configuration parameters>

Channel Agent: this section will report the Channel Agent-specific modules

transfer-agent-channel-actions

Represents the module that provides the Channel Agent Actions. These actions are:

  • <Fetch>: Submit job through the channel

  • <CheckState>: Check Active Jobs' State

  • <CancelActive>: Cancel Active Transfers

Library:

libglite_data_transfer_agent_channel_actions.so

Initialization:
SurlNormalization [expert]

The strategy to be used to change the format of the SURLs. Allowed values are:

  • compact: SURLs would be converted to the format srm://<host>/<sfn>

  • compact-with-port: SURLs would be converted to the format srm://<host>:<port>/<sfn>

  • fully-qualified: SURLs would be converted to the format srm://<host>:<port>/<path>?SFN=<sfn> where <path> is retrieved from the SRM endpoint and is usually <srm/managerv1>

  • disabled: no conversion is performed. The SURLs are processed as submitted by the users

Type: string; Default value: 'compact'

SrmVersion [advanced]

The default SRM version to be used when looking for an SRM endpoint

Type: string; Default value: '1.1'

SrmVersionPolicy [advanced]

The policy that should be applied when choosing the SRM version to use. Allowed values are:

  • default: if the user provided a fully-qualified SURL, use the endpoint extracted from it, otherwise use the value specified in "SrmVersion"

  • with-space-token: use 2.2 if a space token is provided, otherwise behave as the default policy

  • force: force the use of the version specified in "SrmVersion"

Type: string; Default value: 'default'

EnableDelegation [expert]

Enable Client Credential Delegation. If this value is set to true, transfers would be executed using client delegated credentials, retrieved from MyProxy, otherwise the agent would use its Service credentials.

Type: boolean; Default value: 'true'

Configuration:
MaxFilesToCancel []

The maximum number of Ready/Active files to cancel at each iteration of the Cancel action.

Type: integer; Default value: '500'
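The SurlNormalization strategies and the SrmVersionPolicy values described above can be sketched in Python as follows. Both function names are hypothetical. The default_port and endpoint_path arguments stand in for values the agent would obtain for the SRM endpoint, and the surl_version argument stands in for the version implied by a fully-qualified SURL; all three are assumptions of this sketch.

```python
from urllib.parse import urlsplit


def normalize_surl(surl, strategy="compact", default_port=8443,
                   endpoint_path="srm/managerv1"):
    """Rewrite a SURL according to a SurlNormalization strategy."""
    if strategy == "disabled":
        return surl  # processed as submitted by the user
    parts = urlsplit(surl)
    host = parts.hostname
    port = parts.port or default_port
    # In a fully-qualified SURL the SFN is carried by the query string.
    if parts.query.startswith("SFN="):
        sfn = parts.query[len("SFN="):]
    else:
        sfn = parts.path
    sfn = sfn.lstrip("/")
    if strategy == "compact":
        return f"srm://{host}/{sfn}"
    if strategy == "compact-with-port":
        return f"srm://{host}:{port}/{sfn}"
    if strategy == "fully-qualified":
        return f"srm://{host}:{port}/{endpoint_path}?SFN=/{sfn}"
    raise ValueError(f"unknown SurlNormalization value: {strategy}")


def choose_srm_version(policy="default", srm_version="1.1",
                       surl_version=None, space_token=None):
    """Pick the SRM version according to an SrmVersionPolicy value.

    surl_version is None unless the user gave a fully-qualified SURL.
    """
    if policy == "force":
        return srm_version
    if policy == "with-space-token" and space_token:
        return "2.2"
    if policy in ("default", "with-space-token"):
        return surl_version or srm_version
    raise ValueError(f"unknown SrmVersionPolicy value: {policy}")


# Made-up SURL used to exercise the strategies.
full = "srm://se.example.org:8443/srm/managerv1?SFN=/dpm/example.org/file1"
compact = normalize_surl(full, "compact")
with_port = normalize_surl(full, "compact-with-port")
```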

transfer-channel-agent

This module represents the agent that runs the actions concerning a Channel. Actions are provided by different libraries that should be loaded into the agent process using GLite Data Config Service. The Channel Agent needs to run three different kinds of action:

  • <Fetch>: Transfer files through the channel using the configured TransferService

  • <Check>: Check the status of pending TransferService requests

  • <Cancel>: Cancel the active and pending TransferService requests

This configuration assigns a default value to each of these types. However, it is possible to provide different action types and use them: just provide the type of the action in the related <action_type>_Type initialization parameter. By default, all the actions (except the Cancel one) are scheduled to be executed periodically with the interval provided by the "DefaultInterval" configuration parameter. In order to modify the scheduling interval for a given action type, just provide an <action_type>_Interval configuration parameter with the value you want to assign. In addition to the actions described above, the Channel Agent also executes two internal actions:

  • <Heartbeat>: Periodically refresh the status of the agent in the Database

  • <CleanSDCache>: Clean the ServiceDiscovery Cache

The scheduling intervals of these actions are specified by the parameters <Heartbeat_Interval> and <CleanSDCache_Interval>

Library:

libglite_data_transfer_channel_agent.so

Initialization:
Name [/mandatory]

The name of the Channel which the agent is responsible for

Type: string; Default value: ''

Contact []

The contact information of the Administrator responsible for that agent

Type: string; Default value: ''

VOShareType [expert]

This parameter specifies how to interpret the VO Share values associated with a channel. It should be one of the following values:

  • normalized: the share is the value of the channel voshare property for the given VO, normalized to the sum of the shares of all the VOs in the same channel. This option could be used when channel administrators want to guarantee slots for certain VOs, in order to implement some sort of QoS, accepting that the total throughput may be penalized (transfer slots would be reserved for a VO even if that VO has no job to process)

  • absolute: the share is the value of the channel voshare property expressed as a percentage. No normalization is performed, which means that the sum of all the shares on the same channel can exceed 100%. This option could be used when channel administrators want to balance the share between the VOs, without allowing a single VO to fully allocate a channel but minimizing the risk of allocating slots to VOs that don't have any job to process. This option implies some tuning of the VO share values based on experience, but it allows a compromise between throughput and QoS

  • normalized-on-active: the share is the value of the channel voshare property for the given VO, normalized to the sum of the shares of all the VOs in the same channel that have at least one job that can be processed by the Channel Agent (job state should be Active, Pending or Canceling). This option is the default one and could be used when the channel administrators want to optimize the throughput of the channel (the channel can be fully allocated even by one VO), but with a lower QoS

Type: string; Default value: 'normalized-on-active'

Fetch_Type [expert]

The name of the action that provides the logic to fetch transfers

Type: string; Default value: 'glite:Fetch'

Check_Type [expert]

The name of the action that provides the logic to check the state of running transfers

Type: string; Default value: 'glite:CheckState'

Cancel_Type [expert]

The name of the action that provides the logic to cancel active and pending transfers

Type: string; Default value: 'glite:CancelActive'
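The three VOShareType interpretations described above can be compared with a minimal Python sketch. The function name and the example share values are hypothetical, and the result is expressed as a fraction of the channel; how the agent turns that fraction into transfer slots is not covered here.

```python
def effective_shares(voshare, share_type="normalized-on-active", active_vos=()):
    """Return the fraction of the channel assigned to each VO.

    voshare maps a VO name to its channel voshare value; active_vos lists
    the VOs that currently have at least one processable job.
    """
    if share_type == "absolute":
        # Values are percentages; their sum may exceed 100%.
        return {vo: share / 100.0 for vo, share in voshare.items()}
    if share_type == "normalized":
        considered = dict(voshare)
    elif share_type == "normalized-on-active":
        considered = {vo: s for vo, s in voshare.items() if vo in active_vos}
    else:
        raise ValueError(f"unknown VOShareType value: {share_type}")
    total = sum(considered.values())
    return {
        vo: (considered[vo] / total if vo in considered and total else 0.0)
        for vo in voshare
    }


# Made-up shares for two VOs on one channel.
shares = {"atlas": 60, "cms": 60}
normalized = effective_shares(shares, "normalized")
absolute = effective_shares(shares, "absolute")
on_active = effective_shares(shares, "normalized-on-active", active_vos={"atlas"})
```

With only atlas active, normalized-on-active lets atlas use the whole channel, while normalized would still reserve half of it for cms.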

Configuration:
DefaultInterval []

The default interval, in seconds, to be used for scheduling the Channel actions

Type: integer; Default value: '3'

Heartbeat_Interval [expert]

The interval, in seconds, to be used for scheduling the Heartbeat action. The purpose of this action is to periodically update the lastActive timestamp in the t_agent table, in order to show that the agent is up and running. If this value is 0, the Heartbeat action will be disabled

Type: integer; Default value: '60'

CleanSDCache_Interval [expert]

The interval, in seconds, to be used for purging obsolete entries from the ServiceDiscovery cache in order to evaluate changes in the information system. If the SD Cache is disabled, this action doesn't do anything

Type: integer; Default value: '300'

Fetch_Interval [expert]

The interval, in seconds, to be used for scheduling the Fetch action. If this parameter is not set or is empty, the Fetch action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Fetch action will be disabled (useful only for debugging purposes)

Type: integer; Default value: ''

Check_Interval [expert]

The interval, in seconds, to be used for scheduling the Check State action. If this parameter is not set or is empty, the Check State action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Check action will be disabled (useful only for debugging purposes)

Type: integer; Default value: ''

Cancel_Interval [expert]

The interval, in seconds, to be used for scheduling the Cancel action. If this parameter is not set or is empty, the Cancel action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Cancel action will be disabled (useful only for debugging purposes)

Type: integer; Default value: '60'
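The fallback rule shared by the <action_type>_Interval parameters can be written as a short Python sketch. The function name is hypothetical, and representing a disabled action with None is a choice made for this illustration only.

```python
def effective_interval(action_interval, default_interval=3):
    """Resolve the scheduling interval, in seconds, of one action.

    action_interval is the raw parameter value (None or "" when unset).
    Returns None when the action is disabled.
    """
    if action_interval is None or action_interval == "":
        return default_interval  # fall back to DefaultInterval
    value = int(action_interval)
    return None if value == 0 else value


fetch_interval = effective_interval("")     # unset: uses DefaultInterval
cancel_interval = effective_interval("60")  # explicit value
disabled = effective_interval("0")          # 0 disables the action
```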

VO Agent: this section will report the VO Agent-specific modules

transfer-agent-vo-actions

Represents the module that provides the VO Agent Actions. These actions are:

  • <Allocate>: allocate a job to a channel based on the source and destination. If needed, interact with an external Catalog Service in order to perform the name resolution

  • <CheckReadiness>: check if the files are ready to be transferred (i.e. staged on the disk cache)

  • <BasicRetry>: reschedule transfers that are in waiting state

  • <Finalize>: finalize the transfer request and, if needed, register the new replicas into an external catalog service

Library:

libglite_data_transfer_agent_vo_actions.so

Initialization:
SurlNormalization [expert]

The strategy to be used to change the format of the SURLs. Allowed values are:

  • compact: SURLs are converted to the format srm://<host>/<sfn>

  • compact-with-port: SURLs are converted to the format srm://<host>:<port>/<sfn>

  • fully-qualified: SURLs are converted to the format srm://<host>:<port>/<path>?SFN=<sfn>, where <path> is usually retrieved from the SRM endpoint and is usually srm/managerv1

  • disabled: no conversion is performed; the SURLs are processed as submitted by the users

Type: string; Default value: 'compact'
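
As a rough illustration of the formats listed above, the following Python sketch converts a SURL between them. It is not the agent's implementation: the function name normalize_surl, the default port 8443 and the handling of the leading slash of the SFN are assumptions made for this example, and the agent itself usually retrieves the path from the SRM endpoint.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 8443              # assumption: the guide does not state a default port
DEFAULT_PATH = "srm/managerv1"   # the usual endpoint path mentioned in the text


def normalize_surl(surl, strategy="compact", port=DEFAULT_PORT, path=DEFAULT_PATH):
    """Rewrite an SRM SURL according to one of the SurlNormalization strategies."""
    if strategy == "disabled":
        return surl  # processed exactly as submitted

    parts = urlsplit(surl)
    if parts.scheme != "srm" or not parts.hostname:
        raise ValueError("not an SRM SURL: %r" % surl)

    host = parts.hostname
    port = parts.port or port
    if parts.query.startswith("SFN="):
        # Fully-qualified input: keep its endpoint path, take the SFN from the query.
        path = parts.path.strip("/") or path
        sfn = parts.query[len("SFN="):]
    else:
        sfn = parts.path
    sfn = sfn.lstrip("/")  # assumption: <sfn> is handled without its leading slash

    if strategy == "compact":
        return "srm://%s/%s" % (host, sfn)
    if strategy == "compact-with-port":
        return "srm://%s:%d/%s" % (host, port, sfn)
    if strategy == "fully-qualified":
        return "srm://%s:%d/%s?SFN=/%s" % (host, port, path, sfn)
    raise ValueError("unknown SurlNormalization value: %r" % strategy)
```

With the hypothetical host se.example.org, normalize_surl("srm://se.example.org/data/f1", "fully-qualified") returns srm://se.example.org:8443/srm/managerv1?SFN=/data/f1.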

SrmVersion [advanced]

The default SRM version to be used when looking for an SRM endpoint

Type: string; Default value: '1.1'

SrmVersionPolicy [advanced]

The policy that should be applied when choosing the SRM version to use. Allowed values are:

  • default: if the user provided a fully-qualified SURL, use the endpoint extracted from it; otherwise use the value specified in "SrmVersion"

  • with-space-token: use 2.2 if a space token is provided; otherwise behave as the default policy

  • force: always use the version specified in "SrmVersion"

Type: string; Default value: 'default'

EnableDelegation [expert]

Enable Client Credential Delegation. If this value is set to true, the VO Agent actions are executed using client delegated credentials retrieved from MyProxy; otherwise the agent uses its Service credentials. Please note that this option was introduced for testing purposes and is therefore not supported in production.

Type: boolean; Default value: 'true'

Configuration:
MaxFailures []

The maximum number of failures before moving the file to Hold (or Failed, depending on "Transfer Agent FSM" EnableHold advanced parameter)

Type: integer; Default value: '3'

ResubmitDelay []

The delay, in seconds, before a Waiting file is resubmitted

Type: integer; Default value: '600'

CatalogRetryDelay []

The delay, in seconds, before a Catalog Waiting file is retried

Type: integer; Default value: '600'

EnableUnknownSource [advanced]

Enable Unknown Source Site. If this value is set to true, during the allocation phase, in case the source SE is not listed on the Information System, the job would be assigned to the fake "UNKNOWN" site; otherwise the job would fail. This option is useful in order to transfer files from an SE that is on a different Grid

Type: boolean; Default value: 'false'

EnableUnknownDest [advanced]

Enable Unknown Destination Site. If this value is set to true, during the allocation phase, in case the destination SE is not listed on the Information System, the job would be assigned to the fake "UNKNOWN" site; otherwise the job would fail. This option is useful in order to transfer files to an SE that is on a different Grid

Type: boolean; Default value: 'false'

transfer-vo-agent

This module represents the agent that runs the FTS actions that are strictly related to Virtual Organization policies. Actions are provided by different libraries that should be loaded into the agent process using the GLite Data Config Service. The VO Agent needs to run the following kinds of actions:

  • <Allocate>: allocates a transfer job to a channel; in case logical names are provided, it interacts with an external catalog service in order to perform the name resolution

  • <Retry>: schedules the resubmission of failed transfers

  • <Cancel>: cancels a pending file (not yet queued in the channel queue)

  • <CheckReadiness>: checks whether a file is already "online", i.e. available in the disk cache for transferring

  • <Finalize>: finalizes the request; in case logical names are provided, this action registers the new replicas into an external catalog service

This module provides a default value for each of these types; however, it is possible to provide different action types and use them by specifying the type of the action in the related <action_type>_Type initialization parameter.

By default, all the actions (except the Retry and Cancel ones) are scheduled to be executed periodically with the interval provided by the "DefaultInterval" configuration parameter. The interval for the Retry action is specified by the <Retry_Interval> parameter, and the one for the Cancel action by <Cancel_Interval>. To modify the scheduling interval of a specific action type, provide an <action_type>_Interval configuration parameter with the value you want to assign.

In addition to the actions described above, the VO Agent also executes two internal actions:

  • <Heartbeat>: periodically refreshes the status of the agent in the database

  • <CleanSDCache>: cleans the ServiceDiscovery cache

The scheduling intervals of these actions are specified by the <Heartbeat_Interval> and <CleanSDCache_Interval> parameters.

Library:

libglite_data_transfer_vo_agent.so

Initialization:
Name [/mandatory]

The name of the VO to which the agent belongs

Type: string; Default value: ''

Contact []

The contact information of the Administrator responsible for that agent

Type: string; Default value: ''

DisableDelegationForTransfers [expert]

If this parameter is set to true, the transfers are performed using the service credentials of the related Channel Agent; otherwise they use the client proxy certificate downloaded from MyProxy

Type: boolean; Default value: 'false'

Allocate_Type [expert]

The name of the action type that provides the logic to allocate a transfer job to a channel. In case the source and destination SURLs are not provided, this action will also contact a Catalog Service (if the appropriate plugin is configured) in order to perform the name resolution.

Type: string; Default value: 'glite:Allocate'

Retry_Type [expert]

The name of the action type that provides the logic to retry failed transfers

Type: string; Default value: 'glite:BasicRetry'

Cancel_Type [expert]

The name of the action type that provides the logic to cancel pending (i.e. not yet processed by a channel) file transfers

Type: string; Default value: 'glite:CancelPending'

Finalize_Type [expert]

The name of the action type that provides the logic to finalize the transfer, registering the new replicas into a Catalog Service when the file transfers are completed.

Type: string; Default value: 'glite:Finalize'

CheckReadiness_Type [expert]

The name of the action type that provides the logic to check whether the source files are ready to be transferred, i.e. already available ("online") in the source Storage Element cache.

Type: string; Default value: 'glite:CheckReadiness'

Configuration:
DefaultInterval []

The default interval, in seconds, for scheduling the VO actions

Type: integer; Default value: '3'

Heartbeat_Interval [expert]

The interval, in seconds, to be used for scheduling the Heartbeat action. The purpose of this action is to periodically update the lastActive timestamp in the t_agent table, in order to show that the agent is up and running. If this value is 0, the Heartbeat action is disabled.

Type: integer; Default value: '60'

CleanSDCache_Interval [expert]

The interval, in seconds, to be used for purging obsolete entries from the ServiceDiscovery cache in order to pick up changes in the information system. If the SD Cache is disabled, this action does nothing.

Type: integer; Default value: '300'

Allocate_Interval [expert]

The interval, in seconds, to be used for scheduling the Allocate action. If this parameter is not set or is empty, the Allocate action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Allocate action will be disabled (useful only for debugging purposes)

Type: integer; Default value: ''

Retry_Interval [expert]

The interval, in seconds, to be used for scheduling the Retry action. If this parameter is not set or is empty, the Retry action will be scheduled every 60 seconds. If this value is 0, the Retry action will be disabled (useful only for debugging purposes)

Type: integer; Default value: '60'

Cancel_Interval [expert]

The interval, in seconds, to be used for scheduling the Cancel action. If this parameter is not set or is empty, the Cancel action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Cancel action will be disabled (useful only for debugging purposes)

Type: integer; Default value: '60'

Finalize_Interval [expert]

The interval, in seconds, to be used for scheduling the Finalize action. If this parameter is not set or is empty, the Finalize action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the Finalize action will be disabled (useful only for debugging purposes).

Type: integer; Default value: ''

CheckReadiness_Interval [expert]

The interval, in seconds, to be used for scheduling the CheckReadiness action. If this parameter is not set or is empty, the CheckReadiness action will be scheduled using the value provided by the "DefaultInterval" parameter. If this value is 0, the CheckReadiness action will be disabled (useful only for debugging purposes).

Type: integer; Default value: ''
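
The rule shared by the <action_type>_Interval parameters (an unset or empty value falls back to another value, 0 disables the action) can be sketched in Python as follows. The function name effective_interval and the rejection of negative values are assumptions made for illustration; this is not code taken from the agent.

```python
from typing import Optional


def effective_interval(raw: Optional[str], fallback: int) -> Optional[int]:
    """Return the scheduling interval in seconds, or None if the action is disabled.

    raw      -- the <action_type>_Interval value as read from the configuration
                (None or '' when the parameter is not set)
    fallback -- the value to use when raw is unset, e.g. the "DefaultInterval"
                value, or 60 for the Retry action
    """
    if raw is None or raw.strip() == "":
        return fallback
    value = int(raw)
    if value < 0:
        # Assumption: the guide does not say how negative values are treated.
        raise ValueError("interval must not be negative: %r" % raw)
    if value == 0:
        return None  # 0 disables the action
    return value
```

For example, with a DefaultInterval of 3, effective_interval("", 3) gives 3, effective_interval("60", 3) gives 60 and effective_interval("0", 3) gives None (action disabled).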

Example of Configuration Files

This section contains some examples of the GLite Data Transfer Channel and VO Agent configuration files. The values should be changed according to the environment where the agents are installed. The examples use the Fireman, UrlCopy and Oracle connectors.

glite-transfer-channel-agent-channel_1.properties.xml


<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server">
          <value>lxb1414.cern.ch</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>false</value>
        </param>
        <param name="ChannelView">
          <value>true</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-ts-urlcopy">
      <lib>libglite_data_transfer_agent_ts_urlcopy.so</lib>
      <dependencies>
        <lib>libglite_data_transfer_url_copy.so</lib>
      </dependencies>
      <init>
        <param name="MaxTransfers">
          <value>3</value>
        </param>
      </init>
      <config>
        <param name="LogLevel">
          <value>DEBUG</value>
        </param>
      </config>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-channel-actions">
      <lib>libglite_data_transfer_agent_channel_actions.so</lib>
    </component>
    <component name="transfer-channel-agent">
      <lib>libglite_data_transfer_channel_agent.so</lib>
      <init>
        <param name="Name">
          <value>channel_1</value>
        </param>
        <param name="DefaultInterval">
          <value>3</value>
        </param>
      </init>
    </component>
  </components>
</service>

glite-transfer-vo-agent-EGEE.properties.xml (No Catalog)


<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server">
          <value>lxb1414.cern.ch</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>true</value>
        </param>
        <param name="ChannelView">
          <value>false</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name">
          <value>EGEE</value>
        </param>
        <param name="DefaultInterval">
          <value>3</value>
        </param>
      </init>
    </component>
  </components>
</service>

glite-transfer-vo-agent-EGEE.properties.xml (With Catalog)


<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
      <init>
        <param name="Server">
          <value>lxb1414.cern.ch</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-catalog-fireman">
      <lib>libglite_data_transfer_agent_catalog_fireman.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>true</value>
        </param>
        <param name="ChannelView">
          <value>false</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name">
          <value>EGEE</value>
        </param>
        <param name="DefaultInterval">
          <value>3</value>
        </param>
      </init>
    </component>
  </components>
</service>

glite-transfer-vo-agent-EGEE.log-properties

An example of a logging configuration file is:

log4j.rootCategory=DEBUG, file
                                                                                
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%m%n
                                                                                
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.fileName=/var/log/glite/glite-transfer-vo-agent-EGEE.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d %-6p %c - %m%n

The agent configuration template file can then be generated by concatenating the different transfer-agent-*.config-template.xml files contained in the config folder. This configuration template file can then be used to generate the configuration file, as explained in the gLite Data Transfer Agents Install Guide document.

Extending VO Actions with Python Scripts

All the VO Agent actions can be overridden by specifying the appropriate action type in the vo-agent configuration. To do that, the library providing the action should be loaded inside the process (using the Glite Service Configurator Library) and the related action types should be registered in the ActionFactory. However, this approach is quite complex and requires the actions to reimplement most of the logic already provided by the default ones. For that reason, starting from version 2.2.0 the FileTransferAgent adopts the Strategy Pattern, decoupling the logic of the action (which a VO may want to customize) from the common tasks (getting instances from the DB, updating changes, etc.). This approach, combined with the possibility of calling python scripts within the VO Agent process, makes it easier to specify these strategies. At the moment, the only action strategy that can be overridden is the Retry one, but new ones will be added in future versions.

In order to use this feature, the structure of the VO Agent configuration should be modified as illustrated below:

Structure of the FTS VO Agent Configuration File that uses python strategies


<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>

The configuration properties of these new modules are:

agents-python

The purpose of this module is to load and initialize the Python interpreter in order to run python scripts inside an Agent process. The module also initializes some wrapper modules that expose the main agents-common functionalities.

Library:

libglite_data_agents_common_python.so

Initialization:
PythonPath []

The location where the python interpreter looks for modules and packages. The value can be a list of directory paths separated by ':' (in a unix-like system). If this property is set, the module adds the value to the PYTHONPATH environment variable

Type: string; Default value: ''
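
A minimal Python sketch of the effect described for PythonPath is shown below. The helper name extend_python_path is hypothetical, and appending (rather than prepending) the value to an existing PYTHONPATH is an assumption: the guide only states that the value is added.

```python
import os


def extend_python_path(value, environ=os.environ):
    """Add a ':'-separated list of directories to PYTHONPATH (sketch only)."""
    if not value:
        return  # property not set: leave the environment untouched
    current = environ.get("PYTHONPATH", "")
    if current:
        # Assumption: the new directories are appended after the existing ones.
        environ["PYTHONPATH"] = current + os.pathsep + value
    else:
        environ["PYTHONPATH"] = value
```

Passing a plain dictionary as environ allows the behaviour to be tried without modifying the real process environment.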

Configuration:

<No configuration parameters>

transfer-agent-vo-actions-python

Represents the module that overrides some VO Actions, allowing the strategy to be specified as a python script. These actions are, at the moment:

  • <Retry>: retries failed transfers or moves them to Hold

Library:

libglite_data_transfer_agent_vo_actions_python.so

Initialization:
RetryModule [/mandatory]

The name of the python module that contains the function that is evaluated in order to choose whether a file has to be retried, failed or left in the Waiting state. If you want to configure the VO agent to use the retry action that calls this module, you have to set the property "transfer-vo-agent.Retry_Type" to "glite:PythonRetry"

Type: string; Default value: ''

RetryParams [advanced]

This parameter represents an initialization string that will be passed to the python module that contains the retry logic, specified by the "RetryModule" parameter. The format of that string is module dependent

Type: string; Default value: ''

CatalogRetryModule []

The name of the python module that contains the function that is evaluated in order to choose whether a file has to be retried, failed or left in a waiting state after a failure during the catalog interaction. In case the VO doesn't configure a Catalog plugin, this parameter can be left empty

Type: string; Default value: ''

CatalogRetryParams [advanced]

This parameter represents an initialization string that will be passed to the python module that contains the Catalog retry logic, specified by the "CatalogRetryModule" parameter. The format of that string is module dependent

Type: string; Default value: ''

Configuration:

<No configuration parameters>

Please note that in order to be able to call the strategy defined in the python script you have to set the expert property "transfer-vo-agent.Retry_Type" to "glite:PythonRetry"

For a VO Agent configured to handle transfers starting from logical names, the configuration is exactly the same; you just need to configure a Catalog Connector as well.

The current release of transfer-agents provides two python retry strategies, located in GLITE_LOCATION/lib/python/glite/fts/strategies:

  • basic_retry.py: the python implementation of the default BasicRetry Strategy. No configuration parameters are exposed by this script

  • smarter_retry.py: a more advanced retry logic that checks the last transfer failures: it fails immediately if the source file doesn't exist; waits longer if a timeout on get is received; and deletes the destination file if a "file exists" error follows a "transfer error". This script accepts the following configuration parameters, which you can pass through the "transfer-agent-vo-actions-python.RetryParams" property:

    • MaxFailures : The maximum number of failures allowed for a file (Default: 100)

    • MaxFileExistsFailures : The maximum number of consecutive "File Exists" failures (Default: 3)

    • HoldEnabled : If this value is set to true, when a file fails, it will be moved to HOLD. If this value is set to false, the file status will be FAILED (Default: true)

    • OverwriteFailedFiles : When the strategy detects that a "file exists" failure is due to an incomplete cleanup of a previous failed transfer, delete the destination file before performing the new attempt (Default: true)

    • OverwriteExistingFiles : In case of a "file exists" failure, always delete the destination before retrying the transfer (Default: false)

    • DefaultRetryDelay : The default interval (in seconds) before retrying a transfer (Default: 600)

    • RetryDelayForTimeoutOnGet : The interval (in seconds) before retrying a transfer that failed with a "Timeout on Get" error (Default: 1800)

    • RetryDelayForDestFileExists : The interval (in seconds) before retrying a transfer that failed at the destination (Default: 600)

    • SrmServiceType : The service type that identifies an SRM in the information system (Default: SRM)
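
The example configuration later in this guide passes these parameters as a sequence of "Name = value ;" items. A script that wants to accept the same format could parse the string roughly as follows. This is a sketch, not the code of smarter_retry.py: the function names parse_params and load_params are hypothetical, and the defaults are simply copied from the list above.

```python
# Defaults copied from the parameter list of smarter_retry.py in this guide.
DEFAULTS = {
    "MaxFailures": 100,
    "MaxFileExistsFailures": 3,
    "HoldEnabled": True,
    "OverwriteFailedFiles": True,
    "OverwriteExistingFiles": False,
    "DefaultRetryDelay": 600,
    "RetryDelayForTimeoutOnGet": 1800,
    "RetryDelayForDestFileExists": 600,
    "SrmServiceType": "SRM",
}


def parse_params(params):
    """Split 'Name = value ; Name2 = value2 ;' into a dict of strings."""
    result = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue  # ignore empty items, e.g. after the last ';'
        key, sep, value = item.partition("=")
        if not sep or not key.strip():
            raise ValueError("malformed parameter: %r" % item)
        result[key.strip()] = value.strip()
    return result


def load_params(params=""):
    """Return the defaults overridden by the values found in params."""
    settings = dict(DEFAULTS)
    for key, text in parse_params(params).items():
        if key not in DEFAULTS:
            raise KeyError("unknown parameter: %s" % key)
        default = DEFAULTS[key]
        if isinstance(default, bool):  # test bool first: bool is a subclass of int
            if text.lower() not in ("true", "false"):
                raise ValueError("%s must be true or false" % key)
            settings[key] = text.lower() == "true"
        elif isinstance(default, int):
            settings[key] = int(text)
        else:
            settings[key] = text
    return settings
```

Converting each value according to the type of its default keeps the parsing table-driven, so adding a parameter only requires a new entry in DEFAULTS.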

In case you're interested in providing other strategies, you have to comply with the following rules:

Python Retry User Guide

In order to provide a Python script that can be used as a Retry Strategy for the gLite File Transfer Agent, you need to provide the following functions:

  • RetryVersion(): This method should return a string containing the version of the Retry Strategy interface. The rationale for this method is that, based on the returned value, the FTA will be able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".

  • InitRetry(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:

    • params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script

  • Retry(job,file,transfers): This method provides the retry strategy that should be applied for the failed files. The input parameters are:

    • job: the Transfer Job request object

    • file: the Transfer File object that represents the file to evaluate

    • transfers: a list containing all the transfer attempts already performed for the given file, ordered from the most recent one

    This method should return a glite.fts.RetryResult value:

    • Wait : Not enough time has elapsed since the last try

    • Retry: File should be retried

    • Hold : File should be moved to the Hold state

    • Fail : File should be considered Failed
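
A minimal strategy module following the interface above might look like the sketch below. Several details are assumptions and are marked as such in the comments: the way the RetryResult values are accessed (RetryResult.Wait and so on), the finish_time attribute of a transfer attempt, and the "Name = value ;" parameter format. The stand-in RetryResult class exists only so that the sketch can be run outside the agent process.

```python
import time

try:
    from glite.fts import RetryResult  # available inside the agent process
except ImportError:
    class RetryResult:
        """Stand-in for running this sketch outside the agent (assumed names)."""
        Wait, Retry, Hold, Fail = "Wait", "Retry", "Hold", "Fail"

# Integer settings of this sketch; the names mirror the MaxFailures and
# ResubmitDelay parameters of transfer-agent-vo-actions.
_settings = {"MaxFailures": 3, "ResubmitDelay": 600}


def RetryVersion():
    """Version of the Retry Strategy interface implemented by this module."""
    return "1.0"


def InitRetry(params=""):
    """Parse 'Name = value ;' items (assumed format); return False on error."""
    try:
        for item in params.split(";"):
            if not item.strip():
                continue
            key, value = item.split("=", 1)
            if key.strip() not in _settings:
                return False
            _settings[key.strip()] = int(value)
    except ValueError:
        return False
    return True


def Retry(job, file, transfers):
    """Hold after too many attempts, wait if the last one is too recent."""
    if len(transfers) >= _settings["MaxFailures"]:
        return RetryResult.Hold
    if transfers:
        # Hypothetical attribute: completion time of the most recent attempt,
        # in seconds since the epoch. The real object may expose it differently.
        last_finish = transfers[0].finish_time
        if time.time() - last_finish < _settings["ResubmitDelay"]:
            return RetryResult.Wait
    return RetryResult.Retry
```

Keeping the settings in a module-level dictionary lets InitRetry remain optional: when it is not called, the defaults apply.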

In order to provide a Python script that can be used as a Catalog Retry Strategy for the gLite File Transfer Agent, you need to provide the following functions:

  • CatalogRetryVersion(): This method should return a string containing the version of the Catalog Retry Strategy interface. The rationale for this method is that, based on the returned value, the FTA will be able to ensure backward compatibility with scripts compliant with older versions. The current version of the FTA supports "1.0".

  • InitCatalogRetry(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:

    • params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script

  • CatalogRetry(job,files): This method provides the Catalog retry strategy that should be applied to the files that failed during the Catalog interaction. The input parameters are:

    • job: the Transfer Job request object

    • files: the list of Transfer File objects that represent the files to evaluate

    This method should return a glite.fts.CatalogRetryResult value:

    • Wait : Not enough time has elapsed since the last try

    • Retry: File should be retried

    • Fail : File should be considered Failed

Please note that the usage of the Catalog Retry Strategy is not compulsory. In case the Python Strategy bridge is used and no module is specified for overriding the Catalog Retry Strategy, all the files that encountered an error during the Catalog interaction will be failed.

The following section illustrates an example of a configuration that uses a python script as retry strategy:

glite-transfer-vo-agent-python-EGEE.properties.xml (FTS)


<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>true</value>
        </param>
        <param name="ChannelView">
          <value>false</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="agents-python">
      <lib>libglite_data_agents_common_python.so</lib>
      <init>
        <param name="PythonPath">
          <value>/opt/glite/lib/python2.2/site-packages:/opt/glite/lib/python/glite/fts/strategies/</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-agent-vo-actions-python">
      <lib>libglite_data_transfer_agent_vo_actions_python.so</lib>
      <init>
        <param name="RetryModule"> 
          <value>smarter_retry</value> 
        </param>
        <param name="RetryParams"> 
          <value>
            MaxFailures = 10 ;
            HoldEnabled  = true ;
            OverwriteFailedFiles = true ;
            OverwriteExistingFiles = false ;
            DefaultRetryDelay  = 60  ; 
            RetryDelayForTimeoutOnGet   = 300 ;
            RetryDelayForDestFileExists = 60 ;
          </value>
        </param>
      </init>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name">
          <value>EGEE</value>
        </param>
        <param name="Retry_Type">
          <value>glite:PythonRetry</value> 
        </param>
      </init>
    </component>
  </components>
</service>

Provide a Specific CatalogService Plugin

A VO can provide its own CatalogService Plugin by implementing the CatalogService interface defined in /glite/data/transfer/agents/catalog/CatalogService.h. To do that, the library providing that class should be loaded inside the process (using the Glite Service Configurator Library), together with a Factory to create that instance (it uses the AbstractFactory Pattern: see /glite/data/transfer/agents/catalog/CatalogServiceFactory.h). However, this approach is somewhat complex and has some deployment implications (e.g. a possible mismatch between the versions of the FTA and the provided library). For that reason, version 2.3.0 of the FileTransferAgent introduced a Python CatalogService Connector that implements the CatalogService interface by delegating the execution of the related methods to a python script. This approach, combined with the possibility of calling python scripts within the VO Agent process, allows the VOs to specify their own connector in an easier way.

In order to use this feature, the structure of the VO Agent Configuration should be modified as illustrated below:

Structure of the VO Agent Configuration File that uses python plugins for Catalog interaction and Retry Logic


<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="transfer-agent-catalog-python">
      <!-- PythonCatalog Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>

The configuration properties of these new modules are:

agents-python

The purpose of this module is to load and initialize the Python interpreter so that Python scripts can run inside an Agent process. The module also initializes some wrapper modules that expose the main agents-common functionality.

Library:

libglite_data_agents_common_python.so

Initialization:
PythonPath []

The locations where the Python interpreter looks for modules and packages. The value can be a list of directory paths separated by ':' (on a Unix-like system). If this property is set, the module adds the value to the PYTHONPATH environment variable.

Type: string; Default value: ''

Configuration:

<No configuration parameters>
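
The effect of the PythonPath property can be pictured with the short sketch below. It only illustrates the behaviour described above and is not the code used by agents-python: the helper name extend_python_path is hypothetical, and the sketch assumes that updating PYTHONPATH and the interpreter search path together is what the module achieves.

```python
import os
import sys


def extend_python_path(python_path, environ=None, search_path=None):
    """Add the directories listed in `python_path` to PYTHONPATH and sys.path.

    Hypothetical helper: it mimics the documented behaviour of the
    PythonPath property, it is not the code used by agents-python.
    """
    environ = os.environ if environ is None else environ
    search_path = sys.path if search_path is None else search_path

    # An empty value (the default) leaves everything untouched.
    directories = [d.strip() for d in python_path.split(os.pathsep) if d.strip()]
    if not directories:
        return []

    current = environ.get("PYTHONPATH", "")
    parts = [p for p in current.split(os.pathsep) if p]
    for directory in directories:
        if directory not in parts:
            parts.append(directory)
        if directory not in search_path:
            search_path.append(directory)
    environ["PYTHONPATH"] = os.pathsep.join(parts)
    return directories
```

The environ and search_path arguments exist only so that the helper can be tried out without modifying the running process.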

transfer-agent-catalog-python

This module is used to call a Python script in order to resolve logical file names and register new replicas.

Library:

libglite_data_transfer_agent_catalog_python.so

Initialization:
CatalogModule [/mandatory]

The name of the Python module that contains the functions used to check replication permissions, resolve logical file names and register new replicas.

Type: string; Default value: ''

CatalogParams [advanced]

This parameter is an initialization string that is passed to the Python module containing the catalog plugin, specified by the "CatalogModule" parameter. The format of the string is module dependent.

Type: string; Default value: ''

Configuration:

<No configuration parameters>

Please note that you can also configure the Python Retry Strategy described above.

If you want to provide such a connector, you have to comply with the following rules:

Python Catalog User Guide

To provide a Python script that can be used as a CatalogService Connector for the gLite File Transfer Agent, you need to implement the following functions:

  • CatalogPluginVersion(): This method should return a string containing the version of the PythonCatalog interface. Based on the returned value, the FTA can ensure backward compatibility with scripts written for older versions. The current version of the FTA supports "1.0".

  • InitCatalogPlugin(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:

    • params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script

  • GlobalCatalogType(): This method should return a string containing the type of the Global CatalogService the plugin is using. This value is then used to look up the CatalogService endpoint in the information system. If the VO uses local Catalogs, this method can be skipped

  • LocalCatalogType(): This method should return a string containing the type of the Local CatalogService the plugin is using. This value is then used to look up the CatalogService endpoint in the information system. If the VO uses a global Catalog, this method can be skipped

  • GetEndpoint(site,vo_name): This method should return a StringPair that contains the endpoint and the type of the CatalogService that should be used when the Agent has to access or register files at the given site. If this method is not provided, the VO Agent looks up the CatalogService endpoint in the information system using the types returned by the LocalCatalogType and GlobalCatalogType calls. The input parameters are:

    • site: the name of the site where the file is/has to be registered

    • vo_name: the name of the VO which the Agent is responsible for

  • CheckPermissions(endpoint, type, names): This method should check that the user is allowed to replicate the files identified by the logical names listed in the "names" parameter. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If the user is authorized, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. If only some of the files cannot be authorized, you can return the details by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that holds (LogicalName, ErrorReason) pairs. The input parameters are:

    • endpoint: the Catalog service endpoint

    • type: the Catalog service type

    • names: the list of logical names

  • ListSurls(endpoint, type, names, source): This method should return all the replicas registered in the CatalogService identified by the "endpoint" and "type" input parameters, for all the logical names listed in "names". The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The optional parameter "source" can contain the requested source site/SE, as a hint to return only the SURLs involved. The result of the method should be an instance of the glite.fts.catalog.ListSurlsResult class. In case of success, the ListSurlsResult.Status field should be set to ResultStatus.Success and the ListSurlsResult.Surls field should contain the list of the replicas' SURLs (the type of this field is glite.fts.catalog.StringArray2, an array of string arrays). In case of errors, ListSurlsResult.Status should be set to ResultStatus.Failed (or ResultStatus.SomeFailures) and ListSurlsResult.Reason should contain the reason for the failure. If the status is set to ResultStatus.SomeFailures, you can provide the list of failed files and the related reasons in the ListSurlsResult.Failures field, an array that holds (LogicalName, ErrorReason) pairs. Please note that if you provide these details, you have to fill ListSurlsResult.Surls with the list of SURLs for each file that succeeded and with an empty array for each failed one (the ones listed in ListSurlsResult.Failures). The input parameters are:

    • endpoint: the Catalog service endpoint

    • type: the Catalog service type

    • names: the list of logical names

    • source: [Optional] the source SE that can be used to filter the request

  • CheckSurls(endpoint, type, names): This method should check that the files identified by the "logical name-surl" pairs listed in the "names" parameter don't already exist. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If none of the files already exists, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. If only some of the files already exist, you can return the details by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that holds (LogicalName, ErrorReason) pairs. The input parameters are:

    • endpoint: the Catalog service endpoint

    • type: the Catalog service type

    • names: the list of "logical name-surl" pairs

  • GetStats(endpoint, type, names): This method should return the stat info of the files identified by the "logical name-surl" pairs listed in the "names" parameter. The purpose of this method is to synchronize the source and destination catalogs; therefore, if the endpoints of these catalogs are the same, this method is not called. A plugin implementation should derive its own implementation of the FileStat class, adding the stat info and the related metadata that should be synchronized between the catalogs; the FTA then takes the returned object and passes it as an opaque pointer to the RegisterSurls method. The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The "endpoint" and "type" input parameters contain the endpoint and the type of the CatalogService the plugin should contact. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If the call succeeds, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed and CatalogResult.Reason should contain the reason for the failure. If only some of the files failed, you can return the details by setting CatalogResult.Status to ResultStatus.SomeFailures and storing the reason for each failure in CatalogResult.Failures, an array that holds (LogicalName, ErrorReason) pairs. The input parameters are:

    • endpoint: the Catalog service endpoint

    • type: the Catalog service type

    • names: the list of "logical name-surl" pairs

  • RegisterSurls(endpoint, type, replicas): This method should register the new replicas in the CatalogService identified by the "endpoint" and "type" input parameters. The replicas are contained in "replicas", an array of ReplicaStat objects (properties are Logical, Surl, FileStat). The credentials of the user can be retrieved through the X509_USER_PROXY environment variable, which is set by the FTA before the method is called. The result of the method should be an instance of the glite.fts.catalog.CatalogResult class. If the registration succeeds, the CatalogResult.Status field should be set to ResultStatus.Success; otherwise CatalogResult.Status should be set to ResultStatus.Failed (or ResultStatus.SomeFailures) and CatalogResult.Reason should contain the reason for the failure. If the status is set to ResultStatus.SomeFailures, you can provide the list of failed files and the related reasons by filling the CatalogResult.Failures field, an array that holds (LogicalName, ErrorReason) pairs. The input parameters are:

    • endpoint: the Catalog service endpoint

    • type: the Catalog service type

    • replicas: the list of ReplicaStat objects
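
The result conventions described above can be sketched as follows. This is an illustration under stated assumptions, not a working connector: the ResultStatus, CatalogResult and ListSurlsResult classes defined here are stand-ins for the ones in glite.fts.catalog (only the attribute names are taken from this guide; the constructors and value types of the real classes may differ), the catalog is a hypothetical in-memory dictionary, and the helper _finish and the service type "example-catalog" are made up for the example.

```python
# Stand-ins for the classes this guide places in glite.fts.catalog.
# ASSUMPTION: only the attribute names come from the guide.
class ResultStatus:
    Success = "Success"
    Failed = "Failed"
    SomeFailures = "SomeFailures"


class CatalogResult:
    def __init__(self):
        self.Status = ResultStatus.Success
        self.Reason = ""
        self.Failures = []  # (LogicalName, ErrorReason) pairs


class ListSurlsResult(CatalogResult):
    def __init__(self):
        CatalogResult.__init__(self)
        self.Surls = []  # one list of SURLs per requested logical name


# Hypothetical in-memory catalog used instead of a real service.
_REPLICAS = {
    "lfn:/grid/egee/file1": ["srm://se1.example.org/data/file1"],
    "lfn:/grid/egee/file2": ["srm://se1.example.org/data/file2",
                             "srm://se2.example.org/data/file2"],
}


def CatalogPluginVersion():
    return "1.0"


def GlobalCatalogType():
    return "example-catalog"  # hypothetical service type


def _finish(result, total):
    """Set Status and Reason according to the collected failures."""
    if not result.Failures:
        result.Status = ResultStatus.Success
    elif len(result.Failures) == total:
        result.Status = ResultStatus.Failed
        result.Reason = "all the requested files failed"
    else:
        result.Status = ResultStatus.SomeFailures
        result.Reason = "%d of %d files failed" % (len(result.Failures), total)
    return result


# The parameter name `type` shadows a builtin but matches the guide.
def CheckPermissions(endpoint, type, names):
    result = CatalogResult()
    for name in names:
        if name not in _REPLICAS:
            result.Failures.append((name, "no such logical name"))
    return _finish(result, len(names))


def ListSurls(endpoint, type, names, source=""):
    result = ListSurlsResult()
    for name in names:
        surls = _REPLICAS.get(name)
        if surls is None:
            result.Failures.append((name, "no such logical name"))
            result.Surls.append([])  # empty entry for a failed file
            continue
        if source:
            surls = [s for s in surls if source in s]
        result.Surls.append(list(surls))
    return _finish(result, len(names))
```

Reporting ResultStatus.Failed only when every file failed, and ResultStatus.SomeFailures otherwise, is a choice made for this sketch; a real plugin would also use the credentials referenced by X509_USER_PROXY when contacting the catalog.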

Customizing the Name Generation function

In addition to the above extension, you may want to provide a plugin that defines the function generating the destination SURL. As in the other cases, this is possible either by providing an implementation of the interface defined in glite/transfer/agent/namegen/NameGeneration.h, or via a Python script. If you choose the latter option, the configuration properties are:

transfer-agent-namegen-python

This module is used to call a Python script in order to generate the names of new replicas.

Library:

libglite_data_transfer_agent_namegen_python.so

Initialization:
NameGenModule []

The name of the Python module that contains the function used to generate the name of new replicas starting from the logical name. If this parameter is not provided or is empty, the default implementation is used.

Type: string; Default value: ''

NameGenParams [advanced]

This parameter is an initialization string that is passed to the Python module containing the NameGeneration plugin, specified by the "NameGenModule" parameter. The format of the string is module dependent.

Type: string; Default value: ''

Configuration:

<No configuration parameters>

In that case, the structure of a VO Agent Configuration File that uses Python catalog and name generation plugins would be:


<service name="transfer-vo-agent">
  <components>
    <component name="agents-sd">
      <!-- ServiceDiscovery client Configuration-->
    </component>
    <component name="agents-cred-myproxy">
      <!-- MyProxy client Configuration-->
    </component>  
    <component name="agents-python">
      <!-- Python interpreter module -->
    </component>
    <component name="transfer-agent-fsm">
      <!-- FSM Configuration-->
    </component>
    <component name="transfer-agent-namegen-python">
      <!-- Python NameGeneration Configuration-->
    </component>
    <component name="transfer-agent-catalog-python">
      <!-- PythonCatalog Configuration-->
    </component>
    <component name="agents-dao-****">
      <!-- Data Access Object Library -->
    </component>
    <component name="transfer-agent-dao-****">
      <!-- Queue (Data Access Object) Connector -->
    </component>
    <component name="transfer-agent-scheduler">
      <!-- Scheduler Configuration-->
    </component>
    <component name="transfer-agent-vo-actions">
      <!-- VO Actions Configuration -->
    </component>
    <component name="transfer-agent-vo-actions-python">
      <!-- VO Python Actions Configuration -->
    </component>
    <component name="transfer-vo-agent">
      <!-- VO Agent Configuration -->
    </component>
  </components>
</service>

Please note that the previously described component "agents-python" is also needed.
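
A quick way to verify this requirement on an existing configuration file is sketched below. It is a convenience check written for this guide, not a tool shipped with the agents: the function name is hypothetical, and it assumes that the components needing the interpreter can be recognised by the "-python" suffix used for the component names in this guide.

```python
import xml.etree.ElementTree as ET


def missing_python_interpreter(config_xml):
    """Return the Python-based components that lack 'agents-python'.

    An empty list means the configuration is consistent in this respect.
    ASSUMPTION: Python-based components are named '<something>-python'.
    """
    root = ET.fromstring(config_xml)
    names = [c.get("name", "") for c in root.iter("component")]
    python_plugins = [n for n in names
                      if n.endswith("-python") and n != "agents-python"]
    if "agents-python" in names:
        return []
    return python_plugins
```

The function takes the content of the properties file as a string and returns the names of the offending components, so an empty result means that nothing needs to be changed.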

If you want to provide such a name generation function, you have to comply with the following rules:

Python NameGeneration User Guide

To provide a Python script that generates the destination SURL when a logical name is provided, you need to implement the following functions:

  • NameGenVersion(): This method should return a string containing the version of the NameGeneration interface. Based on the returned value, the FTA can ensure backward compatibility with scripts written for older versions. The current version of the FTA supports "1.0".

  • InitNameGen(params = ""): [Optional] This method is called to perform the initialization and should return True in case of success, False otherwise. The input parameters are:

    • params: [Optional] a string containing some configuration parameters. This string is retrieved from the FTA configuration and passed as is. The format is specific to the script

  • Generate(logical,endpoint,se_path,storage_class): This method should return the SURL of the destination file, starting from the input parameters. The input parameters are:

    • logical: the logical name of the file, as specified by the user in the Transfer Job request

    • endpoint: the endpoint of the destination SRM

    • se_path: the path in the Storage Element, retrieved from the Information System (SAPath property)

    • storage_class [Optional]: the storage class for the file, if one is set in the Transfer Job request. In that case the se_path already takes this parameter into account (provided the SA is configured accordingly)

    This method should return a string containing the generated SURL
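
A minimal sketch of a script that follows these rules is given below. The naming scheme is purely illustrative: it assumes that "endpoint" is a plain 'host' or 'host:port' string, that logical names may carry an 'lfn:'-like prefix, and that the initialization string is used as an extra directory prefix. None of these assumptions comes from the FTA itself.

```python
# Directory prefix taken from the initialization string (hypothetical use).
_prefix = ""


def NameGenVersion():
    return "1.0"


def InitNameGen(params=""):
    """Store the optional init string as a directory prefix."""
    global _prefix
    _prefix = params.strip().strip("/")
    return True


def Generate(logical, endpoint, se_path, storage_class=""):
    """Build a SURL of the form srm://<endpoint>/<se_path>/<prefix>/<logical>.

    ASSUMPTIONS: `endpoint` is 'host' or 'host:port'; storage_class is
    ignored because se_path is expected to account for it already.
    """
    # Drop an optional 'lfn:'-like scheme from the logical name.
    path = logical.split(":", 1)[1] if ":" in logical else logical
    parts = [se_path.strip("/"), _prefix, path.strip("/")]
    relative = "/".join(p for p in parts if p)
    return "srm://%s/%s" % (endpoint, relative)
```

With an empty initialization string, Generate("lfn:/grid/egee/file1", "se1.example.org:8443", "/dpm/example.org/home/egee") yields "srm://se1.example.org:8443/dpm/example.org/home/egee/grid/egee/file1".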

The following section illustrates an example configuration that uses a Python script as catalog connector

glite-transfer-vo-agent-EGEE.properties.xml (With Python Plugins)


<?xml version="1.0" encoding="UTF8"?>
<service>
  <components>
    <component name="agents-sd">
      <lib>libglite_data_agents_common_sd.so</lib>
    </component>
    <component name="agents-cred-myproxy">
      <lib>libglite_data_agents_common_cred_myproxy.so</lib>
    </component>
    <component name="transfer-agent-fsm">
      <lib>libglite_data_transfer_agent_fsm.so</lib>
    </component>
    <component name="transfer-agent-namegen-python">
      <lib>libglite_data_transfer_agent_namegen_python.so</lib>
      <init>
        <param name="NameGenModule">
          <value>@VO_NameGen_Script</value>
        </param>
        <param name="NameGenParams">
          <value>@SOME_VALUES</value>
        </param>
      </init>  
    </component>
    <component name="transfer-agent-catalog-python">
      <lib>libglite_data_transfer_agent_catalog_python.so</lib>
      <init>
        <param name="CatalogModule">
          <value>@VO_Catalog_PLUGIN</value>
        </param>
        <param name="CatalogParams">
          <value>@SOME_VALUES</value>
        </param>
      </init>  
    </component>
    <component name="agents-dao-oracle">
      <lib>libglite_data_agents_common_dao_oracle.so</lib>
      <init>
        <param name="ConnectString">
          <value>@DBNAME</value>
        </param>
        <param name="User">
          <value>@DBUSER</value>
        </param>
        <param name="Password">
          <value>@DBPASSWORD</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-dao-oracle">
      <lib>libglite_data_transfer_agent_dao_oracle.so</lib>
      <init>
        <param name="VOView">
          <value>true</value>
        </param>
        <param name="ChannelView">
          <value>false</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-scheduler">
      <lib>libglite_data_transfer_agent_scheduler.so</lib>
    </component>
    <component name="agents-python">
      <lib>libglite_data_agents_common_python.so</lib>
      <init>
        <param name="PythonPath">
            <value>/opt/glite/lib/python2.2/site-packages:
                /opt/glite/lib/python/glite/fts/strategies/</value>
        </param>
      </init>
    </component>
    <component name="transfer-agent-vo-actions">
      <lib>libglite_data_transfer_agent_vo_actions.so</lib>
    </component>
    <component name="transfer-agent-vo-actions-python">
      <lib>libglite_data_transfer_agent_vo_actions_python.so</lib>
      <init>
        <param name="RetryModule"> 
          <value>smarter_retry</value> 
        </param>
        <param name="RetryParams"> 
          <value>
            MaxFailures = 10 ;
            HoldEnabled  = true ;
            OverwriteFailedFiles = true ;
            OverwriteExistingFiles = false ;
            DefaultRetryDelay  = 60  ; 
            RetryDelayForTimeoutOnGet   = 300 ;
            RetryDelayForDestFileExists = 60 ;
          </value>
        </param>
      </init>
    </component>
    <component name="transfer-vo-agent">
      <lib>libglite_data_transfer_vo_agent.so</lib>
      <init>
        <param name="Name">
          <value>EGEE</value>
        </param>
        <param name="Retry_Type">
          <value>glite:PythonRetry</value> 
        </param>
      </init>
    </component>
  </components>
</service>