core package

Subpackages

Submodules

core.analytics module

class core.analytics.Analytics(_db)[source]

Bases: object

GLOBAL_ENABLED = None
LIBRARY_ENABLED = {}
collect_event(library, license_pool, event_type, time=None, **kwargs)[source]
classmethod is_configured(library)[source]

core.app_server module

Implement logic common to more than one of the Simplified applications.

class core.app_server.ComplaintController[source]

Bases: object

A controller to register complaints against objects.

register(license_pool, raw_data)[source]
class core.app_server.ErrorHandler(app, debug=False)[source]

Bases: object

handle(exception)[source]

Something very bad has happened. Notify the client.

class core.app_server.HeartbeatController[source]

Bases: object

HEALTH_CHECK_TYPE = 'application/vnd.health+json'
VERSION_FILENAME = '.version'
heartbeat(conf_class=None)[source]
class core.app_server.URNLookupController(_db)[source]

Bases: object

A controller for looking up OPDS entries for specific books, identified in terms of their Identifier URNs.

Look up a single identifier and generate an OPDS feed.

TODO: This method is tested, but it seems unused and it should be possible to remove it.

process_urns(urns, **process_urn_kwargs)[source]

Process a number of URNs by instantiating a URNLookupHandler and having it do the work.

The information gathered by the URNLookupHandler can be used by the caller to generate an OPDS feed.

Returns:

A URNLookupHandler, or a ProblemDetail if there’s a problem with the request.

work_lookup(annotator, route_name='lookup', **process_urn_kwargs)[source]

Generate an OPDS feed describing works identified by identifier.
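
As a hedged sketch of how an application might wire this controller into Flask (the route name, annotator, and setup below are illustrative assumptions, not part of this module):

    from flask import Flask

    from core.app_server import URNLookupController

    app = Flask(__name__)

    def register_lookup_route(_db, annotator):
        # _db is an active database session and annotator an OPDS annotator
        # object; both are supplied by the surrounding application.
        @app.route("/lookup")
        def lookup():
            controller = URNLookupController(_db)
            # work_lookup() presumably pulls the requested URNs from the active
            # request and returns an OPDS feed (or a ProblemDetail on error).
            return controller.work_lookup(annotator, route_name="lookup")
        return lookup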

class core.app_server.URNLookupHandler(_db)[source]

Bases: object

A helper for URNLookupController that takes URNs as input and looks up their OPDS entries.

This is a separate class from URNLookupController because URNLookupController is designed to not keep state.

UNRECOGNIZED_IDENTIFIER = 'This work is not in the collection.'
WORK_NOT_CREATED = 'Identifier resolved but work not yet created.'
WORK_NOT_PRESENTATION_READY = 'Work created but not yet presentation-ready.'
add_entry(entry)[source]

An identifier lookup succeeded in creating an OPDS entry.

add_message(urn, status_code, message)[source]

An identifier lookup resulted in the creation of a message.

add_urn_failure_messages(failures)[source]
add_work(identifier, work)[source]

An identifier lookup succeeded in finding a Work.

post_lookup_hook()[source]

Run after looking up a number of Identifiers.

By default, does nothing.

process_identifier(identifier, urn, **kwargs)[source]

Turn a URN into a Work suitable for use in an OPDS feed.

process_urns(urns, **process_urn_kwargs)[source]

Processes a list of URNs for a lookup request.

Returns:

None or, to override default feed behavior, a ProblemDetail or Response.

core.app_server.cdn_url_for(*args, **kwargs)[source]
core.app_server.compressible(f)[source]

Decorate a function to make it transparently handle whatever compression the client has announced it supports.

Currently the only form of compression supported is representation-level gzip compression requested through the Accept-Encoding header.

This code was modified from http://kb.sites.apiit.edu.my/knowledge-base/how-to-gzip-response-in-flask/, though I don’t know if that’s the original source; it shows up in a lot of places.
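
For example, a Flask view can be wrapped so that clients announcing gzip support receive a compressed body (a minimal sketch; the route and payload are illustrative):

    from flask import Flask

    from core.app_server import compressible

    app = Flask(__name__)

    @app.route("/feed")
    @compressible
    def feed():
        # If the client sent "Accept-Encoding: gzip", the decorator is expected
        # to gzip this response transparently; otherwise it passes it through.
        return "<feed>...</feed>"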

core.app_server.load_facets_from_request(facet_config=None, worklist=None, base_class=<class 'core.lane.Facets'>, base_class_constructor_kwargs=None, default_entrypoint=None)[source]

Figure out which faceting object this request is asking for.

The active request must have the library member set to a Library object.

Parameters:
  • worklist – The WorkList, if any, associated with the request.

  • facet_config – An object containing the currently configured facet groups, if different from the request library.

  • base_class – The faceting class to instantiate.

  • base_class_constructor_kwargs – Keyword arguments to pass into the faceting class constructor, other than those obtained from the request.

Returns:

A faceting object if possible; otherwise a ProblemDetail.

core.app_server.load_pagination_from_request(base_class=<class 'core.lane.Pagination'>, base_class_constructor_kwargs=None, default_size=None)[source]

Figure out which Pagination object this request is asking for.

Parameters:
  • base_class – A subclass of Pagination to instantiate.

  • base_class_constructor_kwargs – Extra keyword arguments to use when instantiating the Pagination subclass.

  • default_size – The default page size.

Returns:

An instance of base_class.
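
A sketch of how a feed controller might combine this helper with load_facets_from_request inside a request context (the import path for ProblemDetail and the page size are assumptions):

    from core.app_server import (
        load_facets_from_request,
        load_pagination_from_request,
    )
    from core.util.problem_detail import ProblemDetail  # assumed import path

    def load_feed_inputs(worklist):
        # Both helpers inspect the active Flask request; each returns either a
        # usable object or a ProblemDetail describing what went wrong.
        facets = load_facets_from_request(worklist=worklist)
        if isinstance(facets, ProblemDetail):
            return facets
        pagination = load_pagination_from_request(default_size=50)
        if isinstance(pagination, ProblemDetail):
            return pagination
        return facets, pagination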

core.app_server.returns_problem_detail(f)[source]

core.cdn module

Turn local URLs into CDN URLs.

core.cdn.cdnify(url, cdns=None)[source]

Turn local URLs into CDN URLs
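
A minimal sketch of the intended transformation; the exact shape of the cdns mapping (original hostname to CDN base URL) is an assumption here:

    from core.cdn import cdnify

    # Assumed configuration shape: original hostname -> CDN base URL.
    cdns = {"book-covers.example.org": "https://cdn.example.net"}

    url = "https://book-covers.example.org/covers/123.jpg"
    # Expected to return the same path served from cdn.example.net, and to
    # return the URL unchanged if its hostname is not covered by any CDN.
    print(cdnify(url, cdns))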

core.config module

exception core.config.CannotLoadConfiguration(message, debug_message=None)[source]

Bases: IntegrationException

The current configuration of an external integration, or of the site as a whole, is in an incomplete or inconsistent state.

This is more specific than a base IntegrationException because it assumes the problem is evident just by looking at the current configuration, with no need to actually talk to the foreign server.

class core.config.Configuration[source]

Bases: ConfigurationConstants

ALLOW_HOLDS = 'allow_holds'
ANALYTICS_POLICY = 'analytics'
APP_VERSION = 'app_version'
AXIS_INTEGRATION = 'Axis 360'
BASE_URL_KEY = 'base_url'
CDNS_LOADED_FROM_DATABASE = 'loaded_from_database'
CDN_MIRRORED_DOMAIN_KEY = 'mirrored_domain'
CONTENT_SERVER_INTEGRATION = 'Content Server'
DATABASE_INTEGRATION = 'Postgres'
DATABASE_LOG_LEVEL = 'database_log_level'
DATABASE_PRODUCTION_ENVIRONMENT_VARIABLE = 'SIMPLIFIED_PRODUCTION_DATABASE'
DATABASE_PRODUCTION_URL = 'production_url'
DATABASE_TEST_ENVIRONMENT_VARIABLE = 'SIMPLIFIED_TEST_DATABASE'
DATABASE_TEST_URL = 'test_url'
DATA_DIRECTORY = 'data_directory'
DEBUG = 'DEBUG'
DEFAULT_APP_NAME = 'simplified'
DEFAULT_OPDS_FORMAT = 'simple_opds_entry'
ERROR = 'ERROR'
EXCLUDED_AUDIO_DATA_SOURCES = 'excluded_audio_data_sources'
EXTERNAL_TYPE_REGULAR_EXPRESSION = 'external_type_regular_expression'
FEATURED_LANE_SIZE = 'featured_lane_size'
INFO = 'INFO'
INTEGRATIONS = 'integrations'
LANES_POLICY = 'lanes'
LAST_CHECKED_FOR_SITE_CONFIGURATION_UPDATE = 'last_checked_for_site_configuration_update'
LIBRARY_SETTINGS = [{'key': 'name', 'label': l'Name', 'description': l'The human-readable name of this library.', 'category': 'Basic Information', 'level': 3, 'required': True}, {'key': 'short_name', 'label': l'Short name', 'description': l'A short name of this library, to use when identifying it in scripts or URLs, e.g. 'NYPL'.', 'category': 'Basic Information', 'level': 3, 'required': True}, {'key': 'website', 'label': l'URL of the library's website', 'description': l'The library's main website, e.g. "https://www.nypl.org/" (not this Circulation Manager's URL).', 'required': True, 'format': 'url', 'level': 3, 'category': 'Basic Information'}, {'key': 'allow_holds', 'label': l'Allow books to be put on hold', 'type': 'select', 'options': [{'key': 'true', 'label': l'Allow holds'}, {'key': 'false', 'label': l'Disable holds'}], 'default': 'true', 'category': 'Loans, Holds, & Fines', 'level': 3}, {'key': 'enabled_entry_points', 'label': l'Enabled entry points', 'description': l'Patrons will see the selected entry points at the top level and in search results. <p>Currently supported audiobook vendors: Bibliotheca, Axis 360', 'type': 'list', 'options': [{'key': 'All', 'label': 'All'}, {'key': 'Book', 'label': 'eBooks'}, {'key': 'Audio', 'label': 'Audiobooks'}], 'default': ['Book'], 'category': 'Lanes & Filters', 'format': 'narrow', 'readOnly': True, 'level': 3}, {'key': 'featured_lane_size', 'label': l'Maximum number of books in the 'featured' lanes', 'type': 'number', 'default': 15, 'category': 'Lanes & Filters', 'level': 1}, {'key': 'minimum_featured_quality', 'label': l'Minimum quality for books that show up in 'featured' lanes', 'description': l'Between 0 and 1.', 'type': 'number', 'max': 1, 'default': 0.65, 'category': 'Lanes & Filters', 'level': 1}, {'key': 'facets_enabled_order', 'label': l'Allow patrons to sort by', 'type': 'list', 'options': [{'key': 'title', 'label': l'Title'}, {'key': 'author', 'label': l'Author'}, {'key': 'added', 'label': l'Recently Added'}, {'key': 'random', 'label': l'Random'}, {'key': 'relevance', 'label': l'Relevance'}], 'default': ['title', 'author', 'added', 'random', 'relevance'], 'category': 'Lanes & Filters', 'paired': 'facets_default_order', 'level': 2}, {'key': 'facets_enabled_available', 'label': l'Allow patrons to filter availability to', 'type': 'list', 'options': [{'key': 'now', 'label': l'Available now'}, {'key': 'all', 'label': l'All'}, {'key': 'always', 'label': l'Yours to keep'}], 'default': ['now', 'all', 'always'], 'category': 'Lanes & Filters', 'paired': 'facets_default_available', 'level': 2}, {'key': 'facets_enabled_collection', 'label': l'Allow patrons to filter collection to', 'type': 'list', 'options': [{'key': 'full', 'label': l'Everything'}, {'key': 'featured', 'label': l'Popular Books'}], 'default': ['full', 'featured'], 'category': 'Lanes & Filters', 'paired': 'facets_default_collection', 'level': 2}, {'key': 'facets_default_order', 'label': l'Default Sort by', 'type': 'select', 'options': [{'key': 'title', 'label': l'Title'}, {'key': 'author', 'label': l'Author'}, {'key': 'added', 'label': l'Recently Added'}, {'key': 'random', 'label': l'Random'}, {'key': 'relevance', 'label': l'Relevance'}], 'default': 'author', 'category': 'Lanes & Filters', 'skip': True}, {'key': 'facets_default_available', 'label': l'Default Availability', 'type': 'select', 'options': [{'key': 'now', 'label': l'Available now'}, {'key': 'all', 'label': l'All'}, {'key': 'always', 'label': l'Yours to keep'}], 'default': 'all', 'category': 'Lanes & Filters', 'skip': 
True}, {'key': 'facets_default_collection', 'label': l'Default Collection', 'type': 'select', 'options': [{'key': 'full', 'label': l'Everything'}, {'key': 'featured', 'label': l'Popular Books'}], 'default': 'full', 'category': 'Lanes & Filters', 'skip': True}]
LOCALIZATION_LANGUAGES = 'localization_languages'
LOGGING = 'logging'
LOGGING_FORMAT = 'format'
LOGGING_LEVEL = 'level'
LOG_APP_NAME = 'log_app'
LOG_DATA_FORMAT = 'format'
LOG_FORMAT_JSON = 'json'
LOG_FORMAT_TEXT = 'text'
LOG_LEVEL = 'log_level'
LOG_LEVEL_UI = [{'key': 'DEBUG', 'label': l'Debug'}, {'key': 'INFO', 'label': l'Info'}, {'key': 'WARN', 'label': l'Warn'}, {'key': 'ERROR', 'label': l'Error'}]
LOG_OUTPUT_TYPE = 'output'
MEASUREMENT_REAPER = 'measurement_reaper_enabled'
NAME = 'name'
NO_APP_VERSION_FOUND = <object object>
OVERDRIVE_INTEGRATION = 'Overdrive'
POLICIES = 'policies'
RBDIGITAL_INTEGRATION = 'RBDigital'
SHORT_NAME = 'short_name'
SITEWIDE_SETTINGS = [{'key': 'base_url', 'label': l'Base url of the application', 'required': True, 'format': 'url'}, {'key': 'log_level', 'label': l'Log Level', 'type': 'select', 'options': [{'key': 'DEBUG', 'label': l'Debug'}, {'key': 'INFO', 'label': l'Info'}, {'key': 'WARN', 'label': l'Warn'}, {'key': 'ERROR', 'label': l'Error'}], 'default': 'INFO'}, {'key': 'log_app', 'label': l'Application name', 'description': l'Log messages originating from this application will be tagged with this name. If you run multiple instances, giving each one a different application name will help you determine which instance is having problems.', 'default': 'simplified', 'required': True}, {'key': 'database_log_level', 'label': l'Database Log Level', 'type': 'select', 'options': [{'key': 'DEBUG', 'label': l'Debug'}, {'key': 'INFO', 'label': l'Info'}, {'key': 'WARN', 'label': l'Warn'}, {'key': 'ERROR', 'label': l'Error'}], 'description': l'Database logs are extremely verbose, so unless you're diagnosing a database-related problem, it's a good idea to set a higher log level for database messages.', 'default': 'WARN'}, {'key': 'excluded_audio_data_sources', 'label': l'Excluded audiobook sources', 'description': l'Audiobooks from these data sources will be hidden from the collection, even if they would otherwise show up as available.', 'default': None, 'required': True}, {'key': 'measurement_reaper_enabled', 'label': l'Cleanup old measurement data', 'type': 'select', 'description': l'If this settings is 'true' old book measurement data will be cleaned out of the database. Some sites may want to keep this data for later analysis.', 'options': {'true': 'true', 'false': 'false'}, 'default': 'true'}]
SITE_CONFIGURATION_CHANGED = 'Site Configuration Changed'
SITE_CONFIGURATION_LAST_UPDATE = 'site_configuration_last_update'
SITE_CONFIGURATION_TIMEOUT = 'site_configuration_timeout'
THREEM_INTEGRATION = '3M'
TYPE = 'type'
URL = 'url'
VERSION_FILENAME = '.version'
WARN = 'WARN'
WEBSITE_URL = 'website'
classmethod app_version()[source]

Returns the git version of the app, if a .version file exists.

classmethod cdns()[source]

Get CDN configuration, loading it from the database if necessary.

classmethod cdns_loaded_from_database()[source]

Has the site configuration been loaded from the database yet?

classmethod data_directory()[source]
classmethod database_url()[source]

Find the database URL configured for this site.

For compatibility with old configurations, we will look in the site configuration first.

If it’s not there, we will look in the appropriate environment variable.

classmethod get(key, default=None)[source]
instance = {}
classmethod integration(name, required=False)[source]

Find an integration configuration by name.

classmethod integration_url(name, required=False)[source]

Find the URL to an integration.

classmethod last_checked_for_site_configuration_update()[source]

When was the last time we actually checked when the database was updated?

classmethod load(_db=None)[source]

Load configuration information from the filesystem, and (optionally) from the database.

classmethod load_cdns(_db, config_instance=None)[source]
classmethod load_from_file()[source]

Load additional site configuration from a config file.

This is being phased out in favor of taking all configuration from a database.

classmethod localization_languages()[source]
log = <Logger Configuration file loader (WARNING)>
classmethod policy(name, default=None, required=False)[source]

Find a policy configuration by name.

classmethod required(key)[source]
classmethod site_configuration_last_update(_db, known_value=None, timeout=0)[source]

Check when the site configuration was last updated.

Updates Configuration.instance[Configuration.SITE_CONFIGURATION_LAST_UPDATE]. It’s the application’s responsibility to periodically check this value and reload the configuration if appropriate.

Parameters:
  • known_value – We already know when the site configuration was last updated; it's this timestamp. Use it instead of checking with the database.

  • timeout – We will only call out to the database once in this number of seconds. If we are asked again before this number of seconds elapses, we will assume site configuration has not changed. By default, we call out to the database every time.

Returns:

a datetime object.
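
A sketch of the periodic check an application might perform (the reload step is a placeholder assumption):

    from core.config import Configuration

    def maybe_reload_configuration(_db, last_seen=None):
        # Ask at most once every 60 seconds; within that window the cached
        # value is returned without calling out to the database.
        last_update = Configuration.site_configuration_last_update(_db, timeout=60)
        if last_seen is None or last_update > last_seen:
            Configuration.load(_db)  # placeholder for "reload whatever depends on config"
        return last_update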

classmethod static_resources_dir()[source]

Locate the static resources for this installation.

Default location is /simplified_static. To use a different location, set the value of the SIMPLIFIED_STATIC_DIR environment variable.

class core.config.ConfigurationConstants[source]

Bases: object

ALL_ACCESS = 1
DEFAULT_FACET_KEY_PREFIX = 'facets_default_'
ENABLED_FACETS_KEY_PREFIX = 'facets_enabled_'
SYS_ADMIN_ONLY = 3
SYS_ADMIN_OR_MANAGER = 2
core.config.empty_config(replacement_classes=None)[source]
core.config.temp_config(new_config=None, replacement_classes=None)[source]

core.coverage module

class core.coverage.BaseCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: object

Run certain objects through an algorithm. If the algorithm returns success, add a coverage record for that object, so the object doesn’t need to be processed again. If the algorithm returns a CoverageFailure, that failure may itself be memorialized as a coverage record.

Instead of instantiating this class directly, subclass one of its subclasses: either IdentifierCoverageProvider or WorkCoverageProvider.

In IdentifierCoverageProvider the ‘objects’ are Identifier objects and the coverage records are CoverageRecord objects. In WorkCoverageProvider the ‘objects’ are Work objects and the coverage records are WorkCoverageRecord objects.

DEFAULT_BATCH_SIZE = 100
OPERATION = None
SERVICE_NAME = None
add_coverage_record_for(item)[source]

Add a coverage record for the given item.

Implemented in IdentifierCoverageProvider and WorkCoverageProvider.

add_coverage_records_for(items)[source]

Add CoverageRecords for a group of items from a batch, each of which was successful.

property collection

Retrieve the Collection object associated with this CoverageProvider.

failure_for_ignored_item(work)[source]

Create a CoverageFailure recording the coverage provider’s failure to even try to process an item.

Implemented in IdentifierCoverageProvider and WorkCoverageProvider.

finalize_batch()[source]

Do whatever is necessary to complete this batch before moving on to the next one.

e.g. committing the database session or uploading a bunch of assets to S3.

finalize_timestampdata(timestamp, **kwargs)[source]

Finalize the given TimestampData and write it to the database.

handle_success(item)[source]

Do something special to mark the successful coverage of the given item.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Create a database query returning only those items that need coverage.

Parameters:

identifiers – A list of Identifier objects. If present, return only items that need coverage and are associated with one of these identifiers.

Implemented in IdentifierCoverageProvider and WorkCoverageProvider.

property log
property operation

Which operation should this CoverageProvider use to distinguish between multiple CoverageRecords from the same data source?

process_batch(batch)[source]

Do what it takes to give coverage records to a batch of items.

Returns:

A mixed list of coverage records and CoverageFailures.

process_batch_and_handle_results(batch)[source]
Returns:

A 2-tuple (counts, records).

counts is a 3-tuple (successes, transient failures, persistent failures).

records is a mixed list of coverage record objects (for successes and persistent failures) and CoverageFailure objects (for transient failures).

process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

record_failure_as_coverage_record(failure)[source]

Convert the given CoverageFailure to a coverage record.

Implemented in IdentifierCoverageProvider and WorkCoverageProvider.

run()[source]
run_once(progress, count_as_covered=None)[source]

Try to grant coverage to a number of uncovered items.

NOTE: If you override this method, it's very important that your implementation eventually do one of the following:
  • Set progress.finish
  • Set progress.exception
  • Raise an exception

If you don’t do any of these things, run() will assume you still have work to do, and will keep calling run_once() forever.

Parameters:
  • progress – A CoverageProviderProgress representing the progress made so far, and the number of records that need to be ignored for the rest of the run.

  • count_as_covered – Which values for CoverageRecord.status should count as meaning ‘already covered’.

Returns:

A CoverageProviderProgress representing whatever additional progress has been made.
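
A hedged sketch of an override that honors the NOTE above by always finishing or recording an exception (the provider name is illustrative, and in real code you would subclass IdentifierCoverageProvider or WorkCoverageProvider rather than this class directly):

    import datetime

    from core.coverage import BaseCoverageProvider

    class OneShotProvider(BaseCoverageProvider):
        SERVICE_NAME = "One-shot provider"  # illustrative

        def run_once(self, progress, count_as_covered=None):
            try:
                # ... attempt to grant coverage to a batch of uncovered items ...
                progress.finish = datetime.datetime.utcnow()  # assumed to accept a datetime
            except Exception as e:
                progress.exception = str(e)  # or simply re-raise; either satisfies the NOTE
            return progress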

run_once_and_update_timestamp()[source]
should_update(coverage_record)[source]

Should we do the work to update the given coverage record?

property timestamp

Look up the Timestamp object for this CoverageProvider.

class core.coverage.BibliographicCoverageProvider(collection, **kwargs)[source]

Bases: CollectionCoverageProvider

Fill in bibliographic metadata for all books in a Collection.

e.g. ensures that we get Overdrive coverage for all Overdrive IDs in a collection.

Although a BibliographicCoverageProvider may gather CirculationData for a book, it cannot guarantee equal coverage for all Collections that contain that book. CirculationData should be limited to things like formats that don’t vary between Collections, and you should use a CollectionMonitor to make sure your circulation information is up-to-date for each Collection.

handle_success(identifier)[source]

Once a book has bibliographic coverage, it can be given a work and made presentation ready.

class core.coverage.CatalogCoverageProvider(collection, **kwargs)[source]

Bases: CollectionCoverageProvider

Most CollectionCoverageProviders provide coverage to Identifiers that are licensed through a given Collection.

A CatalogCoverageProvider provides coverage to Identifiers that are present in a given Collection’s catalog.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Find all Identifiers in this Collection’s catalog but lacking coverage through this CoverageProvider.

class core.coverage.CollectionCoverageProvider(collection, **kwargs)[source]

Bases: IdentifierCoverageProvider

A CoverageProvider that covers all the Identifiers currently licensed to a given Collection.

You should subclass this CoverageProvider if you want to create Works (as opposed to operating on existing Works) or update the circulation information for LicensePools. You can't use it to create new LicensePools, since it only operates on Identifiers that already have a LicensePool in the given Collection.

If a book shows up in multiple Collections, the first Collection to process it takes care of it for the others. Any books that were processed through their membership in another Collection will be left alone.

For this reason it’s important that subclasses of this CoverageProvider only deal with bibliographic information and format availability information (such as links to open-access downloads). You’ll have problems if you try to use CollectionCoverageProvider to keep track of information like the number of licenses available for a book.

In addition to defining the class variables defined by CoverageProvider, you must define the class variable PROTOCOL when subclassing this class. This is the entity that provides the licenses for this Collection. It should be one of the collection-type provider constants defined in the ExternalIntegration class, such as ExternalIntegration.OPDS_IMPORT or ExternalIntegration.OVERDRIVE.
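
A minimal subclass sketch showing the required class variables (the names and the identifier type are illustrative; the metadata logic itself is elided):

    from core.coverage import CollectionCoverageProvider
    from core.model import ExternalIntegration, Identifier

    class MyOPDSBibliographicProvider(CollectionCoverageProvider):
        SERVICE_NAME = "My OPDS bibliographic provider"  # illustrative
        DATA_SOURCE_NAME = "My OPDS source"              # illustrative
        PROTOCOL = ExternalIntegration.OPDS_IMPORT       # licenses come from an OPDS import
        INPUT_IDENTIFIER_TYPES = Identifier.URI          # illustrative identifier type

        def process_item(self, identifier):
            # Fetch or compute metadata for this identifier here; return the
            # identifier on success, or a CoverageFailure on failure.
            return identifier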

DEFAULT_BATCH_SIZE = 10
EXCLUDE_SEARCH_INDEX = False
INPUT_IDENTIFIER_TYPES = None
PROTOCOL = None
classmethod all(_db, **kwargs)[source]

Yield a sequence of CollectionCoverageProvider instances, one for every Collection that gets its licenses from cls.PROTOCOL.

CollectionCoverageProviders will be yielded in a random order.

Parameters:

kwargs – Keyword arguments passed into the constructor for CollectionCoverageProvider (or, more likely, one of its subclasses).

classmethod collections(_db)[source]

Returns a randomly sorted list of the Collections covered by this provider.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Find all Identifiers associated with this Collection but lacking coverage through this CoverageProvider.

license_pool(identifier, data_source=None)[source]

Finds this Collection’s LicensePool for the given Identifier, creating one if necessary.

Parameters:

data_source – If it’s necessary to create a LicensePool, the new LicensePool will have this DataSource. The default is to use the DataSource associated with the CoverageProvider. This should only be needed by the metadata wrangler.

run_once(*args, **kwargs)[source]

Try to grant coverage to a number of uncovered items.

NOTE: If you override this method, it's very important that your implementation eventually do one of the following:
  • Set progress.finish
  • Set progress.exception
  • Raise an exception

If you don’t do any of these things, run() will assume you still have work to do, and will keep calling run_once() forever.

Parameters:
  • progress – A CoverageProviderProgress representing the progress made so far, and the number of records that need to be ignored for the rest of the run.

  • count_as_covered – Which values for CoverageRecord.status should count as meaning ‘already covered’.

Returns:

A CoverageProviderProgress representing whatever additional progress has been made.

set_metadata_and_circulation_data(identifier, metadata, circulationdata)[source]

Makes sure that the given Identifier has a Work, Edition (in the context of this Collection), and LicensePool (ditto), and that all the information is up to date.

Returns:

The Identifier (if successful) or an appropriate CoverageFailure (if not).

set_presentation_ready(identifier)[source]

Set a Work presentation-ready.

work(identifier, license_pool=None, **calculate_work_kwargs)[source]

Finds or creates a Work for this Identifier as licensed through this Collection.

If the given Identifier already has a Work associated with it, that Work will always be used, since an Identifier can only have one Work associated with it.

However, if there is no current Work, a Work will only be created if the given Identifier already has a LicensePool in the Collection associated with this CoverageProvider (or if a LicensePool to use is provided.) This method will not create new LicensePools.

If a Work needs to be created, or the existing Work is not presentation-ready, LicensePool.calculate_work() will be called to create or recalculate it. If there is an existing presentation-ready Work, calculate_work() will not be called; instead, the Work will be slated for recalculation when its metadata changes through Metadata.apply().

Parameters:

calculate_work_kwargs – Keyword arguments to pass into calculate_work() if and when it is called.

Returns:

A Work, if possible. Otherwise, a CoverageFailure explaining why no Work could be created.

class core.coverage.CollectionCoverageProviderJob(collection, provider_class, progress, **provider_kwargs)[source]

Bases: DatabaseJob

run(_db, **kwargs)[source]
class core.coverage.CoverageFailure(obj, exception, data_source=None, transient=True, collection=None)[source]

Bases: object

Object representing the failure to provide coverage.

to_coverage_record(operation=None)[source]

Convert this failure into a CoverageRecord.

to_work_coverage_record(operation)[source]

Convert this failure into a WorkCoverageRecord.

class core.coverage.CoverageProviderProgress(*args, **kwargs)[source]

Bases: TimestampData

A TimestampData optimized for the special needs of CoverageProviders.

property achievements

Represent the achievements of a CoverageProvider as a human-readable string.

class core.coverage.IdentifierCoverageProvider(_db, collection=None, input_identifiers=None, replacement_policy=None, **kwargs)[source]

Bases: BaseCoverageProvider

Run Identifiers of certain types (ISBN, Overdrive, OCLC Number, etc.) through an algorithm associated with a certain DataSource.

This class is designed to be subclassed rather than instantiated directly. Subclasses should define SERVICE_NAME, OPERATION (optional), DATA_SOURCE_NAME, and INPUT_IDENTIFIER_TYPES. SERVICE_NAME and OPERATION are described in BaseCoverageProvider; the rest are described in appropriate comments in this class.

COVERAGE_COUNTS_FOR_EVERY_COLLECTION = True
DATA_SOURCE_NAME = None
INPUT_IDENTIFIER_TYPES = <object object>
NO_SPECIFIED_TYPES = <object object>
add_coverage_record_for(item)[source]

Record this CoverageProvider’s coverage for the given Edition/Identifier, as a CoverageRecord.

classmethod bulk_register(identifiers, data_source=None, collection=None, force=False, autocreate=False)[source]

Registers identifiers for future coverage.

This method is primarily for use with CoverageProviders that use the registered_only flag to process items. It’s currently only in use on the Metadata Wrangler.

Parameters:
  • data_source – DataSource object or basestring representing a DataSource name.

  • collection – Collection object to be associated with the CoverageRecords.

  • force – When True, even existing CoverageRecords will have their status reset to CoverageRecord.REGISTERED.

  • autocreate – When True, a basestring provided by data_source will be autocreated in the database if it didn’t previously exist.

Returns:

A tuple of two lists: the first contains the freshly REGISTERED CoverageRecords, and the second contains the Identifiers that were ignored because they already had coverage.

TODO: Take identifier eligibility into account when registering.
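
A usage sketch, reusing the illustrative provider subclass from the CollectionCoverageProvider example above (identifiers and collection are supplied by the caller):

    def register_for_coverage(identifiers, collection):
        # Returns the freshly REGISTERED CoverageRecords and the Identifiers
        # that were skipped because they already had coverage.
        new_records, already_covered = MyOPDSBibliographicProvider.bulk_register(
            identifiers,
            data_source="My OPDS source",  # a DataSource name as a string (illustrative)
            collection=collection,
        )
        return new_records, already_covered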

can_cover(identifier)[source]

Can this IdentifierCoverageProvider do anything with the given Identifier?

This is not needed in the normal course of events, but a caller may need to decide whether to pass an Identifier into ensure_coverage() or register().

property collection_or_not

If this CoverageProvider needs to be run multiple times on the same identifier in different collections, this returns the collection. Otherwise, this returns None.

property data_source

Look up the DataSource object corresponding to the service we’re running this data through.

Out of an excess of caution, we look up the DataSource every time, rather than storing it, in case a CoverageProvider is ever used in an environment where the database session is scoped (e.g. the circulation manager).

edition(identifier)[source]

Finds or creates an Edition representing this coverage provider’s view of a given Identifier.

ensure_coverage(item, force=False)[source]

Ensure coverage for one specific item.

Parameters:
  • item – This should always be an Identifier, but this code will also work if it’s an Edition. (The Edition’s .primary_identifier will be covered.)

  • force – Run the coverage code even if an existing coverage record for this item was created after self.cutoff_time.

Returns:

Either a coverage record or a CoverageFailure.

TODO: This could be abstracted and moved to BaseCoverageProvider.

failure(identifier, error, transient=True)[source]

Create a CoverageFailure object to memorialize an error.

failure_for_ignored_item(item)[source]

Create a CoverageFailure recording the CoverageProvider’s failure to even try to process an item.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Find all items lacking coverage from this CoverageProvider.

Items should be Identifiers, though Editions should also work.

By default, all identifiers of the INPUT_IDENTIFIER_TYPES which don’t already have coverage are chosen.

Parameters:

identifiers – The batch of identifier objects to test for coverage. identifiers and self.input_identifiers can intersect: if this provider was created to run over specific Identifiers and you want to batch within those Identifiers, you can use both parameters.

record_failure_as_coverage_record(failure)[source]

Turn a CoverageFailure into a CoverageRecord object.

classmethod register(identifier, data_source=None, collection=None, force=False, autocreate=False)[source]

Registers an identifier for future coverage.

See CoverageProvider.bulk_register for more information about using this method.

run_on_specific_identifiers(identifiers)[source]

Split a specific set of Identifiers into batches and process one batch at a time.

This is for use by IdentifierInputScript.

Returns:

The same (counts, records) 2-tuple as process_batch_and_handle_results.

set_metadata(identifier, metadata)[source]

Finds or creates the Edition for an Identifier and updates it with the given metadata.

Returns:

The Identifier (if successful) or an appropriate CoverageFailure (if not).

class core.coverage.MARCRecordWorkCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: WorkPresentationProvider

Make sure all presentation-ready works have an up-to-date MARC record.

DEFAULT_BATCH_SIZE = 1000
OPERATION = 'generate-marc'
SERVICE_NAME = 'MARC Record Work Coverage Provider'
process_item(work)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.coverage.OPDSEntryWorkCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: WorkPresentationProvider

Make sure all presentation-ready works have an up-to-date OPDS entry.

This is different from the OPDSEntryCacheMonitor, which sweeps over all presentation-ready works, even ones which are already covered.

DEFAULT_BATCH_SIZE = 1000
OPERATION = 'generate-opds'
SERVICE_NAME = 'OPDS Entry Work Coverage Provider'
process_item(work)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.coverage.PresentationReadyWorkCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: WorkCoverageProvider

A WorkCoverageProvider that only covers presentation-ready works.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Find all Works lacking coverage from this CoverageProvider.

By default, all Works which don’t already have coverage are chosen.

Parameters:

identifiers – If present, only Works connected with one of the given identifiers are chosen.

class core.coverage.WorkClassificationCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: WorkPresentationEditionCoverageProvider

Calculates the ‘expensive’ parts of a work’s presentation: classifications, summary, and quality.

We do all three at once because they all require gathering together all equivalent identifiers for the work, which can be, by far, the most expensive part of the job.

This is called ‘classification’ because that’s the most likely use of this coverage provider. If you want to make sure a bunch of works get their summaries recalculated, you need to remember that the coverage record to delete is CLASSIFY_OPERATION.

DEFAULT_BATCH_SIZE = 20
OPERATION = 'classify'
POLICY = <core.model.PresentationCalculationPolicy object>
SERVICE_NAME = 'Work classification coverage provider'
class core.coverage.WorkCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: BaseCoverageProvider

Perform coverage operations on Works rather than Identifiers.

add_coverage_record_for(work)[source]

Record this CoverageProvider’s coverage for the given Edition/Identifier, as a WorkCoverageRecord.

add_coverage_records_for(works)[source]

Add WorkCoverageRecords for a group of works from a batch, each of which was successful.

failure(work, error, transient=True)[source]

Create a CoverageFailure object.

failure_for_ignored_item(work)[source]

Create a CoverageFailure recording the WorkCoverageProvider’s failure to even try to process a Work.

items_that_need_coverage(identifiers=None, **kwargs)[source]

Find all Works lacking coverage from this CoverageProvider.

By default, all Works which don’t already have coverage are chosen.

Parameters:

identifiers – If present, only Works connected with one of the given identifiers are chosen.

record_failure_as_coverage_record(failure)[source]

Turn a CoverageFailure into a WorkCoverageRecord object.

classmethod register(work, force=False)[source]

Registers a work for future coverage.

This method is primarily for use with CoverageProviders that use the registered_only flag to process items. It’s currently only in use on the Metadata Wrangler.

Parameters:

force – Set to True to reset an existing CoverageRecord's status to 'registered', regardless of its current status.

class core.coverage.WorkPresentationEditionCoverageProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: WorkPresentationProvider

Make sure each Work has an up-to-date presentation edition.

This basically means comparing all the Editions associated with the Work and building a composite Edition.

Expensive operations – calculating work quality, summary, and genre classification – are reserved for WorkClassificationCoverageProvider

OPERATION = 'choose-edition'
POLICY = <core.model.PresentationCalculationPolicy object>
SERVICE_NAME = 'Calculated presentation coverage provider'
process_item(work)[source]

Recalculate the presentation for a Work.

class core.coverage.WorkPresentationProvider(_db, batch_size=None, cutoff_time=None, registered_only=False)[source]

Bases: PresentationReadyWorkCoverageProvider

Recalculate some part of presentation for works that are presentation-ready.

A Work’s presentation is set when it’s made presentation-ready (thus the name). When that happens, a number of WorkCoverageRecords are set for that Work.

A migration script may remove a coverage record if it knows a work needs to have some aspect of its presentation recalculated. These providers give back the ‘missing’ coverage.

DEFAULT_BATCH_SIZE = 100

core.entrypoint module

class core.entrypoint.AudiobooksEntryPoint[source]

Bases: MediumEntryPoint

INTERNAL_NAME = 'Audio'
URI = 'http://bib.schema.org/Audiobook'
class core.entrypoint.EbooksEntryPoint[source]

Bases: MediumEntryPoint

INTERNAL_NAME = 'Book'
URI = 'http://schema.org/EBook'
class core.entrypoint.EntryPoint[source]

Bases: object

An EntryPoint is a top-level entry point into a library's Lane structure that may apply additional filters to the Lane structure.

The “Books” and “Audiobooks” entry points (defined in the EbooksEntryPoint and AudiobooksEntryPoint classes) are different views on a library’s Lane structure; each applies an additional filter against Edition.medium.

Each individual EntryPoint should be represented as a subclass of EntryPoint, and should be registered with the overall EntryPoint class by calling EntryPoint.register.

The list of entry points shows up as a facet group in a library’s top-level grouped feed, and in search results. The SimplyE client renders entry points as a set of tabs.
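
A hedged sketch of registering a new entry point; the subclass, medium, display title, and URI are invented for illustration:

    from core.entrypoint import EntryPoint, MediumEntryPoint

    class PeriodicalsEntryPoint(MediumEntryPoint):
        # Hypothetical entry point filtering on a "Periodical" medium.
        INTERNAL_NAME = "Periodical"
        URI = "http://schema.org/Periodical"

    # Presumably adds the class to ENTRY_POINTS, BY_INTERNAL_NAME and
    # DISPLAY_TITLES so it can appear as a facet group and as a client tab.
    EntryPoint.register(PeriodicalsEntryPoint, "Periodicals", default_enabled=False)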

BY_INTERNAL_NAME = {'All': <class 'core.entrypoint.EverythingEntryPoint'>, 'Audio': <class 'core.entrypoint.AudiobooksEntryPoint'>, 'Book': <class 'core.entrypoint.EbooksEntryPoint'>}
DEFAULT_ENABLED = [<class 'core.entrypoint.EbooksEntryPoint'>]
DISPLAY_TITLES = {<class 'core.entrypoint.EverythingEntryPoint'>: 'All', <class 'core.entrypoint.EbooksEntryPoint'>: 'eBooks', <class 'core.entrypoint.AudiobooksEntryPoint'>: 'Audiobooks'}
ENABLED_SETTING = 'enabled_entry_points'
ENTRY_POINTS = [<class 'core.entrypoint.EverythingEntryPoint'>, <class 'core.entrypoint.EbooksEntryPoint'>, <class 'core.entrypoint.AudiobooksEntryPoint'>]
URI = None
classmethod modify_database_query(_db, qu)[source]

If necessary, modify a database query so that it restricts results to items shown through this entry point.

The default behavior is to not change a database query at all.

classmethod modify_search_filter(filter)[source]

If necessary, modify an ElasticSearch Filter object so that it restricts results to items shown through this entry point.

The default behavior is not to change the Filter object at all.

Parameters:

filter – An external_search.Filter object.

classmethod register(entrypoint_class, display_title, default_enabled=False)[source]

Register the given subclass with the master registry kept in the EntryPoint class.

Parameters:
  • entrypoint_class – A subclass of EntryPoint.

  • display_title – The title to use when displaying this entry point to patrons.

  • default_enabled – New libraries should have this entry point enabled by default.

classmethod unregister(entrypoint_class)[source]

Undo a subclass’s registration.

Only used in tests.

class core.entrypoint.EverythingEntryPoint[source]

Bases: EntryPoint

An entry point that has everything.

INTERNAL_NAME = 'All'
URI = 'http://schema.org/CreativeWork'
class core.entrypoint.MediumEntryPoint[source]

Bases: EntryPoint

An entry point that creates a view on one specific medium.

The medium is expected to be the entry point’s INTERNAL_NAME.

The URI is expected to be the one in Edition.schema_to_additional_type[INTERNAL_NAME]

classmethod modify_database_query(_db, qu)[source]

Modify a query against Work+LicensePool+Edition to match only items with the right medium.

classmethod modify_search_filter(filter)[source]

Modify an external_search.Filter object so it only finds titles available through this EntryPoint.

Parameters:

filter – An external_search.Filter object.

core.exceptions module

exception core.exceptions.BaseError(message=None, inner_exception=None)[source]

Bases: Exception

Base class for all errors

property inner_exception

Returns an inner exception

Returns:

Inner exception

Return type:

Exception

core.external_list module

class core.external_list.ClassificationBasedMembershipManager(custom_list, subject_fragments)[source]

Bases: MembershipManager

Manage a custom list containing all Editions whose primary Identifier is classified under one of the given subject fragments.

property new_membership

Iterate over the new membership of the list.

Yield:

a sequence of Edition objects

class core.external_list.CustomListFromCSV(data_source_name, list_name, metadata_client=None, overwrite_old_data=False, annotation_field='text', annotation_author_name_field='name', annotation_author_affiliation_field='location', first_appearance_field='timestamp', **kwargs)[source]

Bases: CSVMetadataImporter

Create a CustomList, with entries, from a CSV file.

annotation_citation(row)[source]

Extract a citation for an annotation from a row of a CSV file.

metadata_to_list_entry(custom_list, data_source, now, metadata)[source]

Convert a Metadata object to a CustomListEntry.

metadata_to_title(now, metadata)[source]

Convert a Metadata object to a TitleFromExternalList object.

to_customlist(_db, dictreader)[source]

Turn the CSV file in dictreader into a CustomList.

TODO: Keep track of the list’s current members. If any item was on the list but is no longer on the list, set its last_appeared date to its most recent appearance.
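
A sketch of driving the importer from a CSV file; the file name, DataSource name, and list name are illustrative:

    import csv

    from core.external_list import CustomListFromCSV

    def import_list(_db, path="staff-picks.csv"):
        importer = CustomListFromCSV(
            data_source_name="Staff picks source",  # illustrative DataSource name
            list_name="Staff Picks",                # illustrative list name
        )
        with open(path, newline="") as fh:
            # Each row of the CSV becomes (or updates) a CustomListEntry.
            importer.to_customlist(_db, csv.DictReader(fh))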

class core.external_list.MembershipManager(custom_list, log=None)[source]

Bases: object

Manage the membership of a custom list based on some criteria.

property new_membership

Iterate over the new membership of the list.

Yield:

a sequence of Edition objects

update(update_time=None)[source]
class core.external_list.TitleFromExternalList(metadata, first_appearance, most_recent_appearance, annotation)[source]

Bases: object

This class helps you convert data from external lists into Simplified Edition and CustomListEntry objects.

to_custom_list_entry(custom_list, metadata_client, overwrite_old_data=False)[source]

Turn this object into a CustomListEntry with associated Edition.

to_edition(_db, metadata_client, overwrite_old_data=False)[source]

Create or update an Edition object for this list item.

We have two goals here:

1. Make sure there is an Edition representing the list’s view of the data.

2. If at all possible, connect the Edition’s primary identifier to other identifiers in the system, identifiers which may have associated LicensePools. This can happen in two ways:

2a. The Edition’s primary identifier, or other identifiers associated with the Edition, may be directly associated with LicensePools. This can happen if a book’s list entry includes (e.g.) an Overdrive ID.

2b. The Edition's permanent work ID may identify it as the same work as other Editions in the system. In that case this Edition's primary identifier may be associated with the other Editions' primary identifiers.

core.facets module

class core.facets.FacetConfig(enabled_facets, default_facets, entrypoints=[])[source]

Bases: FacetConstants

A class that implements the facet-related methods of Library, and allows modifications to the enabled and default facets. For use when a controller needs to use a facet configuration different from the site-wide facets.

default_facet(group_name)[source]
enable_facet(group_name, facet)[source]
enabled_facets(group_name)[source]
classmethod from_library(library)[source]
set_default_facet(group_name, facet)[source]

Add facet to the list of possible values for group_name, even if the library does not have that facet configured.

class core.facets.FacetConstants[source]

Bases: object

AVAILABILITY_FACETS = ['now', 'all', 'always']
AVAILABILITY_FACET_GROUP_NAME = 'available'
AVAILABLE_ALL = 'all'
AVAILABLE_NOT_NOW = 'not_now'
AVAILABLE_NOW = 'now'
AVAILABLE_OPEN_ACCESS = 'always'
COLLECTION_FACETS = ['full', 'featured']
COLLECTION_FACET_GROUP_NAME = 'collection'
COLLECTION_FULL = 'full'
DEFAULT_ENABLED_FACETS = {'available': ['all', 'now', 'always'], 'collection': ['full', 'featured'], 'order': ['author', 'title', 'added']}
DEFAULT_FACET = {'available': 'all', 'collection': 'full', 'order': 'author'}
ENTRY_POINT_FACET_GROUP_NAME = 'entrypoint'
ENTRY_POINT_REL = 'http://librarysimplified.org/terms/rel/entrypoint'
FACETS_BY_GROUP = {'available': ['now', 'all', 'always'], 'collection': ['full', 'featured'], 'order': ['title', 'author', 'added', 'random', 'relevance']}
FACET_DISPLAY_TITLES = {'added': l'Recently Added', 'all': l'All', 'always': l'Yours to keep', 'author': l'Author', 'featured': l'Popular Books', 'full': l'Everything', 'last_update': l'Last Update', 'now': l'Available now', 'random': l'Random', 'relevance': l'Relevance', 'series': l'Series Position', 'title': l'Title', 'work_id': l'Work ID'}
GROUP_DESCRIPTIONS = {'available': l'Allow patrons to filter availability to', 'collection': l'Allow patrons to filter collection to', 'order': l'Allow patrons to sort by'}
GROUP_DISPLAY_TITLES = {'available': l'Availability', 'collection': l'Collection', 'order': l'Sort by'}
MAX_CACHE_AGE_NAME = 'max_age'
ORDER_ADDED_TO_COLLECTION = 'added'
ORDER_ASCENDING = 'asc'
ORDER_AUTHOR = 'author'
ORDER_DESCENDING = 'desc'
ORDER_DESCENDING_BY_DEFAULT = ['added', 'last_update']
ORDER_FACETS = ['title', 'author', 'added', 'random', 'relevance']
ORDER_FACET_GROUP_NAME = 'order'
ORDER_LAST_UPDATE = 'last_update'
ORDER_RANDOM = 'random'
ORDER_RELEVANCE = 'relevance'
ORDER_SERIES_POSITION = 'series'
ORDER_TITLE = 'title'
ORDER_WORK_ID = 'work_id'
SORT_ORDER_TO_ELASTICSEARCH_FIELD_NAME = {'added': 'licensepools.availability_time', 'author': 'sort_author', 'last_update': 'last_update_time', 'random': 'random', 'series': ['series_position', 'sort_title'], 'title': 'sort_title', 'work_id': '_id'}

core.lane module

class core.lane.BaseFacets[source]

Bases: FacetConstants

Basic faceting class that doesn’t modify a search filter at all.

This is intended solely for use as a base class.

CACHED_FEED_TYPE = None
property cached

This faceting object’s opinion on whether feeds should be cached.

Returns:

A boolean, or None for ‘no opinion’.

property facet_groups

Yield a list of 4-tuples (facet group, facet value, new Facets object, selected) for use in building OPDS facets.

This does not include the ‘entry point’ facet group, which must be handled separately.

items()[source]

Yields a 2-tuple for every active facet setting.

These tuples are used to generate URLs that can identify specific facet settings, and to distinguish between CachedFeed objects that represent the same feed with different facet settings.

max_cache_age = None
modify_database_query(_db, qu)[source]

If necessary, modify a database query so that the resulting items conform to the constraints of this faceting object.

The default behavior is to not modify the query.

modify_search_filter(filter)[source]

Modify an external_search.Filter object to filter out works excluded by the business logic of this faceting class.

property query_string

A query string fragment that propagates all active facet settings.

scoring_functions(filter)[source]

Create a list of ScoringFunction objects that modify how works in the given WorkList should be ordered.

Most subclasses will not use this because they order works using the ‘order’ feature.

classmethod selectable_entrypoints(worklist)[source]

Ignore all entry points, even if the WorkList supports them.

class core.lane.DatabaseBackedFacets(library, collection, availability, order, order_ascending=None, enabled_facets=None, entrypoint=None, entrypoint_is_default=False, **constructor_kwargs)[source]

Bases: Facets

A generic faceting object designed for managing queries against the database. (Other faceting objects are designed for managing Elasticsearch searches.)

ORDER_FACET_TO_DATABASE_FIELD = {'author': <sqlalchemy.orm.attributes.InstrumentedAttribute object>, 'last_update': <sqlalchemy.orm.attributes.InstrumentedAttribute object>, 'title': <sqlalchemy.orm.attributes.InstrumentedAttribute object>, 'work_id': <sqlalchemy.orm.attributes.InstrumentedAttribute object>}
classmethod available_facets(config, facet_group_name)[source]

Exclude search orders not available through database queries.

classmethod default_facet(config, facet_group_name)[source]

Exclude search orders not available through database queries.

modify_database_query(_db, qu)[source]

Restrict a query so that it matches only works that fit the criteria of this faceting object. Ensure query is appropriately ordered and made distinct.

order_by()[source]

Given these Facets, create a complete ORDER BY clause for queries against WorkModelWithGenre.

class core.lane.DatabaseBackedWorkList[source]

Bases: WorkList

A WorkList that can get its works from the database in addition to (or possibly instead of) the search index.

Even when works _are_ obtained through the search index, a DatabaseBackedWorkList is then created to look up the Work objects for use in an OPDS feed.

age_range_filter_clauses()[source]

Create a clause that filters out all books not classified as suitable for this DatabaseBackedWorkList’s age range.

audience_filter_clauses(_db, qu)[source]

Create a SQLAlchemy filter that excludes books whose intended audience doesn’t match what we’re looking for.

classmethod base_query(_db)[source]

Return a query that contains the joins set up as necessary to create OPDS feeds.

bibliographic_filter_clauses(_db, qu)[source]

Create a SQLAlchemy filter that excludes books whose bibliographic metadata doesn’t match what we’re looking for.

query is either qu, or a new query that has been modified to join against additional tables.

Returns:

A 2-tuple (query, clauses).

customlist_filter_clauses(qu)[source]

Create a filter clause that allows only books that are on one of the CustomLists allowed by Lane configuration.

Returns:

A 2-tuple (query, clauses).

query is the same query as qu, possibly extended with additional table joins.

clauses is a list of SQLAlchemy statements for use in a filter() or case() statement.

genre_filter_clause(qu)[source]
modify_database_query_hook(_db, qu)[source]

A hook method allowing subclasses to modify a database query that’s about to find all the works in this WorkList.

This can avoid the need for complex subclasses of DatabaseBackedFacets.

only_show_ready_deliverable_works(_db, query, show_suppressed=False)[source]

Restrict a query to show only presentation-ready works present in an appropriate collection which the default client can fulfill.

Note that this assumes the query has an active join against LicensePool.

works_from_database(_db, facets=None, pagination=None, **kwargs)[source]

Create a query against the works table that finds Work objects corresponding to all the Works that belong in this WorkList.

The apply_filters() implementation defines which Works qualify for membership in a WorkList of this type.

This tends to be slower than WorkList.works, but not all lanes can be generated through search engine queries.

Parameters:
  • _db – A database connection.

  • facets – A faceting object, which may place additional constraints on WorkList membership.

  • pagination – A Pagination object indicating which part of the WorkList the caller is looking at.

  • kwargs – Ignored – only included for compatibility with works().

Returns:

A Query.
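
A sketch of generating one database-backed page of works; the Pagination constructor arguments are assumptions, and the worklist and facets objects come from the surrounding application:

    from core.lane import Pagination

    def page_of_works(_db, worklist, facets):
        # worklist is a DatabaseBackedWorkList (for example a Lane); it can
        # build a page directly from the database when the search index is
        # unavailable or a lane cannot be expressed as a search query.
        pagination = Pagination(offset=0, size=25)  # assumed constructor arguments
        query = worklist.works_from_database(_db, facets=facets, pagination=pagination)
        return query.all()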

class core.lane.DefaultSortOrderFacets(library, collection, availability, order, order_ascending=None, enabled_facets=None, entrypoint=None, entrypoint_is_default=False, **constructor_kwargs)[source]

Bases: Facets

A faceting object that changes the default sort order.

Subclasses must set DEFAULT_SORT_ORDER

classmethod available_facets(config, facet_group_name)[source]

Make sure the default sort order is the first item in the list of available sort orders.

classmethod default_facet(config, facet_group_name)[source]

The default value for the given facet group.

The default value must be one of the values returned by available_facets() above.

class core.lane.Facets(library, collection, availability, order, order_ascending=None, enabled_facets=None, entrypoint=None, entrypoint_is_default=False, **constructor_kwargs)[source]

Bases: FacetsWithEntryPoint

A full-fledged facet class that supports complex navigation between multiple facet groups.

Despite the generic name, this is only used in ‘page’ type OPDS feeds that list all the works in some WorkList.

ORDER_BY_RELEVANCE = 'relevance'
classmethod available_facets(config, facet_group_name)[source]

Which facets are enabled for the given facet group?

You can override this to forcibly enable or disable facets that might not be enabled in library configuration, but you can't make up totally new facets.

TODO: This system would make more sense if you _could_ make up totally new facets, maybe because each facet was represented as a policy object rather than a key to code implemented elsewhere in this class. Right now this method implies more flexibility than actually exists.

classmethod default(library, collection=None, availability=None, order=None, entrypoint=None)[source]
classmethod default_facet(config, facet_group_name)[source]

The default value for the given facet group.

The default value must be one of the values returned by available_facets() above.

property enabled_facets

Yield a 3-tuple of lists (order, availability, collection) representing facet values enabled via initialization or configuration

The ‘entry point’ facet group is handled separately, since it is not always used.

property facet_groups

Yield a list of 4-tuples (facet group, facet value, new Facets object, selected) for use in building OPDS facets.

This does not yield anything for the ‘entry point’ facet group, which must be handled separately.

classmethod from_request(library, config, get_argument, get_header, worklist, default_entrypoint=None, **extra)[source]

Load a faceting object from an HTTP request.

items()[source]

Yields a 2-tuple for every active facet setting.

In this class that just means the entrypoint and any max_cache_age.

modify_database_query(_db, qu)[source]

Restrict a query against Work+LicensePool+Edition so that it matches only works that fit the criteria of this Faceting object.

Sort order facet cannot be handled in this method, but can be handled in subclasses that override this method.

modify_search_filter(filter)[source]

Modify the given external_search.Filter object so that it reflects the settings of this Facets object.

This is the Elasticsearch equivalent of apply(). However, the Elasticsearch implementation of (e.g.) the meaning of the different availability statuses is kept in Filter.build().

navigate(collection=None, availability=None, order=None, entrypoint=None)[source]

Create a slightly different Facets object from this one.

class core.lane.FacetsWithEntryPoint(entrypoint=None, entrypoint_is_default=False, max_cache_age=None, **kwargs)[source]

Bases: BaseFacets

Basic Facets class that knows how to filter a query based on a selected EntryPoint.

classmethod from_request(library, facet_config, get_argument, get_header, worklist, default_entrypoint=None, **extra_kwargs)[source]

Load a faceting object from an HTTP request.

Parameters:
  • facet_config – A Library (or mock of one) that knows which subset of the available facets are configured.

  • get_argument – A callable that takes one argument and retrieves (or pretends to retrieve) a query string parameter of that name from an incoming HTTP request.

  • get_header – A callable that takes one argument and retrieves (or pretends to retrieve) an HTTP header of that name from an incoming HTTP request.

  • worklist – A WorkList associated with the current request, if any.

  • default_entrypoint – Select this EntryPoint if the incoming request does not specify an enabled EntryPoint. If this is None, the first enabled EntryPoint will be used as the default.

  • extra_kwargs – A dictionary of keyword arguments to pass into the constructor when a faceting object is instantiated.

Returns:

A FacetsWithEntryPoint, or a ProblemDetail if there’s a problem with the input from the request.
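
A minimal sketch of calling from_request() inside a Flask view. `library` and `worklist` are assumed to come from the application’s own request handling, and the ProblemDetail import path is an assumption.

    import flask

    from core.lane import FacetsWithEntryPoint
    from core.util.problem_detail import ProblemDetail  # assumed import path

    facets = FacetsWithEntryPoint.from_request(
        library,                    # the Library whose policy applies
        library,                    # facet_config: a Library (or mock of one)
        flask.request.args.get,     # get_argument
        flask.request.headers.get,  # get_header
        worklist,
    )
    if isinstance(facets, ProblemDetail):
        ...  # report the problem to the client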

items()[source]

Yields a 2-tuple for every active facet setting.

In this class that just means the entrypoint and any max_cache_age.

classmethod load_entrypoint(name, valid_entrypoints, default=None)[source]

Look up an EntryPoint by name, assuming it’s valid in the given WorkList.

Parameters:
  • valid_entrypoints – The EntryPoints that might be valid. This is probably not the value of WorkList.selectable_entrypoints, because an EntryPoint selected in a WorkList remains valid (but not selectable) for all of its children.

  • default – A class to use as the default EntryPoint if none is specified. If no default is specified, the first enabled EntryPoint will be used.

Returns:

A 2-tuple (EntryPoint class, is_default).

classmethod load_max_cache_age(value)[source]

Convert a value for the MAX_CACHE_AGE_NAME parameter to a value that CachedFeed will understand.

Parameters:

value – A string.

Returns:

For now, either CachedFeed.IGNORE_CACHE or None.

modify_database_query(_db, qu)[source]

Modify the given database query so that it reflects this set of facets.

modify_search_filter(filter)[source]

Modify the given external_search.Filter object so that it reflects this set of facets.

navigate(entrypoint)[source]

Create a very similar FacetsWithEntryPoint that points to a different EntryPoint.

classmethod selectable_entrypoints(worklist)[source]

Which EntryPoints can be selected for these facets on this WorkList?

In most cases, there are no selectable EntryPoints; this generally happens only at the top level.

By default, this is completely determined by the WorkList. See SearchFacets for an example that changes this.

class core.lane.FeaturedFacets(minimum_featured_quality, entrypoint=None, random_seed=None, **kwargs)[source]

Bases: FacetsWithEntryPoint

A simple faceting object that configures a query so that the ‘most featurable’ items are at the front.

This is mainly a convenient thing to pass into AcquisitionFeed.groups().

CACHED_FEED_TYPE = 'groups'
classmethod default(lane, **kwargs)[source]
modify_search_filter(filter)[source]

Modify the given external_search.Filter object so that it reflects this set of facets.

navigate(minimum_featured_quality=None, entrypoint=None)[source]

Create a slightly different FeaturedFacets object based on this one.

scoring_functions(filter)[source]

Generate scoring functions that weight works randomly, but with ‘more featurable’ works tending to be at the top.
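
A minimal sketch of using FeaturedFacets to build the contents of a grouped feed. `_db`, `library`, and `lane` are assumed to exist; 0.65 is an illustrative minimum quality, not a configured value.

    from core.lane import FeaturedFacets

    facets = FeaturedFacets(minimum_featured_quality=0.65)
    for work, sublane in lane.groups(_db, facets=facets):
        print(sublane.display_name, work.title)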

class core.lane.HierarchyWorkList[source]

Bases: WorkList

A WorkList representing part of a hierarchical view of a library’s collection. (As opposed to a non-hierarchical view such as search results or “books by author X”.)

accessible_to(patron)[source]

As a matter of library policy, is the given Patron allowed to access this WorkList?

Most of the logic is inherited from WorkList, but there’s also a restriction based on the site hierarchy.

Parameters:

patron – A Patron

Returns:

A boolean

class core.lane.Lane(**kwargs)[source]

Bases: Base, DatabaseBackedWorkList, HierarchyWorkList

A WorkList that draws its search criteria from a row in a database table.

A Lane corresponds roughly to a section in a branch library or bookstore. Lanes are the primary means by which patrons discover books.

MAX_CACHE_AGE = 1200
add_genre(genre, inclusive=True, recursive=True)[source]

Create a new LaneGenre for the given genre and associate it with this Lane.

Mainly used in tests.

classmethod affected_by_customlist(customlist)[source]

Find all Lanes whose membership is partially derived from the membership of the given CustomList.

audiences
cachedfeeds
cachedmarcfiles
property children
property collection_ids
property customlist_ids

Find the database ID of every CustomList such that a Work filed in that List should be in this Lane.

Returns:

A list of CustomList IDs, possibly empty.

customlists
property depth

How deep is this lane in this site’s hierarchy? i.e. how many times do we have to follow .parent before we get None?

display_name
property entrypoints

Lanes cannot currently have EntryPoints.

explain()[source]

Create a series of human-readable strings to explain a lane’s settings.

fiction
property genre_ids

Find the database ID of every Genre such that a Work classified in that Genre should be in this Lane.

Returns:

A list of genre IDs, or None if this Lane does not consider genres at all.

genres = ObjectAssociationProxyInstance(AssociationProxy('lane_genres', 'genre'))
get_library(_db)[source]

For compatibility with WorkList.get_library().

groups(_db, include_sublanes=True, pagination=None, facets=None, search_engine=None, debug=False)[source]

Return a list of (Work, Lane) 2-tuples describing a sequence of featured items for this lane and (optionally) its children.

Parameters:
  • pagination – A Pagination object which may affect how many works each child of this WorkList may contribute.

  • facets – A FeaturedFacets object.

id
include_self_in_grouped_feed
inherit_parent_restrictions
is_self_or_descendant(ancestor)[source]

Is this WorkList the given WorkList or one of its descendants?

Parameters:

ancestor – A WorkList.

Returns:

A boolean.

lane_genres
languages
library_id
license_datasource_id
list_datasource
property list_datasource_id
list_seen_in_previous_days
max_cache_age(type)[source]

Determine how long a feed for this WorkList should be cached internally.

Parameters:

type – The type of feed.

media
parent
parent_id
property parentage

Yield the parent, grandparent, etc. of this Lane.

The Lane may be inside one or more non-Lane WorkLists, but those WorkLists are not counted in the parentage.

priority
root_for_patron_type
search(_db, query_string, search_client, pagination=None, facets=None)[source]

Find works in this lane that also match a search query.

Parameters:
  • _db – A database connection.

  • query_string – Search for this string.

  • search_client – An ExternalSearchIndex object.

  • pagination – A Pagination object.

  • facets – A faceting object, probably a SearchFacets.

property search_target

Obtain the WorkList that should be searched when someone initiates a search from this Lane.

size
size_by_entrypoint
sublanes
target_age
update_size(_db, search_engine=None)[source]

Update the stored estimate of the number of Works in this Lane.

property url_name

Return the name of this lane to be used in URLs.

Since most aspects of the lane can change through administrative action, we use the internal database ID of the lane in URLs.

property uses_customlists

Does the works() implementation for this Lane look for works on CustomLists?

visible
property visible_children

A WorkList’s children can be used to create a grouped acquisition feed for that WorkList.

class core.lane.LaneGenre(**kwargs)[source]

Bases: Base

Relationship object between Lane and Genre.

classmethod from_genre(genre)[source]

Used in the Lane.genres association proxy.

genre
genre_id
id
inclusive
lane
lane_id
recursive
class core.lane.Pagination(offset=0, size=50)[source]

Bases: object

DEFAULT_CRAWLABLE_SIZE = 100
DEFAULT_SEARCH_SIZE = 10
DEFAULT_SIZE = 50
MAX_SIZE = 100
classmethod default()[source]
property first_page
classmethod from_request(get_arg, default_size=None)[source]

Instantiate a Pagination object from a Flask request.

property has_next_page

Returns a boolean reporting whether pagination is done for a query.

Either total_size or this_page_size must be set for this method to be accurate.

items()[source]
modify_database_query(_db, qu)[source]

Modify the given database query with OFFSET and LIMIT.

modify_search_query(search)[source]

Modify a Search object so that it retrieves only a single ‘page’ of results.

Returns:

A Search object.

property next_page
page_loaded(page)[source]

An actual page of results has been fetched. Keep any internal state that would be useful to know when reasoning about earlier or later pages.

property previous_page
property query_string
classmethod size_from_request(get_arg, default)[source]
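
A minimal sketch of paging through a database query with Pagination. `_db` is assumed to be a database session and `qu` an existing query against Work.

    from core.lane import Pagination

    pagination = Pagination(offset=0, size=25)
    page = pagination.modify_database_query(_db, qu).all()
    pagination.page_loaded(page)           # record what was actually fetched
    if pagination.has_next_page:
        next_slice = pagination.next_page  # a Pagination for the next slice
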
class core.lane.SearchFacets(**kwargs)[source]

Bases: Facets

A Facets object designed to filter search results.

Most search result filtering is handled by WorkList, but this allows someone to, e.g., search a multi-lingual WorkList in their preferred language.

DEFAULT_MIN_SCORE = 500
classmethod default_facet(ignore, group_name)[source]

The default facet settings for SearchFacets are hard-coded.

By default, we will search the full collection and all availabilities, and order by match quality rather than any bibliographic field.

classmethod from_request(library, config, get_argument, get_header, worklist, default_entrypoint=<class 'core.entrypoint.EverythingEntryPoint'>, **extra)[source]

Load a faceting object from an HTTP request.

items()[source]

Yields a 2-tuple for every active facet setting.

This means the EntryPoint (handled by the superclass) as well as possible settings for ‘media’ and ‘min_score’.

modify_search_filter(filter)[source]

Modify the given external_search.Filter object so that it reflects this SearchFacets object.

navigate(**kwargs)[source]

Create a slightly different Facets object from this one.

classmethod selectable_entrypoints(worklist)[source]

If the WorkList has more than one facet, an ‘everything’ facet is added for search purposes.

class core.lane.SpecificWorkList(work_ids)[source]

Bases: DatabaseBackedWorkList

A WorkList that only finds specific works, identified by ID.

modify_database_query_hook(_db, qu)[source]

A hook method allowing subclasses to modify a database query that’s about to find all the works in this WorkList.

This can avoid the need for complex subclasses of DatabaseBackedFacets.

class core.lane.TopLevelWorkList[source]

Bases: HierarchyWorkList

A special WorkList representing the top-level view of a library’s collection.

class core.lane.WorkList[source]

Bases: object

An object that can obtain a list of Work objects for use in generating an OPDS feed.

By default, these Work objects come from a search index.

CACHED_FEED_TYPE = None
MAX_CACHE_AGE = 600
accessible_to(patron)[source]

As a matter of library policy, is the given Patron allowed to access this WorkList?

append_child(child)[source]

Add one child to the list of children in this WorkList.

This hook method can be overridden to modify the child’s configuration so as to make it fit with what the parent is offering.

property audience_key

Translates the audiences list into a URL-safe string.

property customlist_ids

Return the custom list IDs.

property display_name_for_all

The display name to use when referring to the set of all books in this WorkList, as opposed to the WorkList itself.

filter(_db, facets)[source]

Helper method to instantiate a Filter object for this WorkList.

Using this ensures that modify_search_filter_hook() is always called.

property full_identifier

A human-readable identifier for this WorkList that captures its position within the hierarchy.

get_customlists(_db)[source]

Get customlists associated with the Worklist.

get_library(_db)[source]

Find the Library object associated with this WorkList.

groups(_db, include_sublanes=True, pagination=None, facets=None, search_engine=None, debug=False)[source]

Extract a list of samples from each child of this WorkList. This can be used to create a grouped acquisition feed for the WorkList.

Parameters:
  • pagination – A Pagination object which may affect how many works each child of this WorkList may contribute.

  • facets – A FeaturedFacets object that may restrict the works on view.

  • search_engine – An ExternalSearchIndex to use when asking for the featured works in a given WorkList.

  • debug – A debug argument passed into search_engine when running the search.

Yield:

A sequence of (Work, WorkList) 2-tuples, with each WorkList representing the child WorkList in which the Work is found.

property has_visible_children
property hierarchy

The portion of the WorkList hierarchy that culminates in this WorkList.

property inherit_parent_restrictions

Since a WorkList has no parent, it cannot inherit any restrictions from its parent. This method is defined for compatibility with Lane.

inherited_value(k)[source]

Try to find this WorkList’s value for the given key (e.g. ‘fiction’ or ‘audiences’).

If it’s not set, try to inherit a value from the WorkList’s parent. This only works if this WorkList has a parent and is configured to inherit values from its parent.

Note that inheritance works differently for genre_ids and customlist_ids – use inherited_values() for that.

inherited_values(k)[source]

Find the values for the given key (e.g. ‘genre_ids’ or ‘customlist_ids’) imposed by this WorkList and its parentage.

This is for values like .genre_ids and .customlist_ids, where each member of the WorkList hierarchy can impose a restriction on query results, and the effects of the restrictions are additive.

initialize(library, display_name=None, genres=None, audiences=None, languages=None, media=None, customlists=None, list_datasource=None, list_seen_in_previous_days=None, children=None, priority=None, entrypoints=None, fiction=None, license_datasource=None, target_age=None)[source]

Initialize with basic data.

This is not a constructor, to avoid conflicts with Lane, an ORM object that subclasses this object but does not use this initialization code.

Parameters:
  • library – Only Works available in this Library will be included in lists.

  • display_name – Name to display for this WorkList in the user interface.

  • genres – Only Works classified under one of these Genres will be included in lists.

  • audiences – Only Works classified under one of these audiences will be included in lists.

  • languages – Only Works in one of these languages will be included in lists.

  • media – Only Works in one of these media will be included in lists.

  • fiction – Only Works with this fiction status will be included in lists.

  • target_age – Only Works targeted at readers in this age range will be included in lists.

  • license_datasource – Only Works with a LicensePool from this DataSource will be included in lists.

  • customlists – Only Works included on one of these CustomLists will be included in lists.

  • list_datasource – Only Works included on a CustomList associated with this DataSource will be included in lists. This overrides any specific CustomLists provided in customlists.

  • list_seen_in_previous_days – Only Works that were added to a matching CustomList within this number of days will be included in lists.

  • children – This WorkList has children, which are also WorkLists.

  • priority – A number indicating where this WorkList should show up in relation to its siblings when it is the child of some other WorkList.

  • entrypoints – A list of EntryPoint classes representing different ways of slicing up this WorkList.
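
A minimal sketch of building an ad hoc WorkList. `library` is assumed to be an existing Library object; the display name and languages are illustrative values.

    from core.lane import WorkList

    worklist = WorkList()
    worklist.initialize(
        library,
        display_name="Spanish-language fiction",
        languages=["spa"],
        fiction=True,
    )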

is_self_or_descendant(ancestor)[source]

Is this WorkList the given WorkList or one of its descendants?

Parameters:

ancestor – A WorkList.

Returns:

A boolean.

property language_key

Return a string identifying the languages used in this WorkList. This will usually be in the form of ‘eng,spa’ (English and Spanish).

max_cache_age(type)[source]

Determine how long a feed for this WorkList should be cached internally.

modify_search_filter_hook(filter)[source]

A hook method allowing subclasses to modify a Filter object that’s about to find all the works in this WorkList.

This can avoid the need for complex subclasses of Facets.

overview_facets(_db, facets)[source]

Convert a generic FeaturedFacets to some other faceting object, suitable for showing an overview of this WorkList in a grouped feed.

property parent

A WorkList has no parent. This method is defined for compatibility with Lane.

property parentage

WorkLists have no parentage. This method is defined for compatibility with Lane.

search(_db, query, search_client, pagination=None, facets=None, debug=False)[source]

Find works in this WorkList that match a search query.

Parameters:
  • _db – A database connection.

  • query – Search for this string.

  • search_client – An ExternalSearchIndex object.

  • pagination – A Pagination object.

  • facets – A faceting object, probably a SearchFacets.

  • debug – Pass in True to see a summary of results returned from the search index.

property search_target

By default, a WorkList is searchable.

classmethod top_level_for_library(_db, library)[source]

Create a WorkList representing this library’s collection as a whole.

If no top-level visible lanes are configured, the WorkList will be configured to show every book in the collection.

If a single top-level Lane is configured, it will be returned as the WorkList.

Otherwise, a WorkList containing the visible top-level lanes is returned.

property unique_key

A string key that uniquely describes this WorkList within its Library.

This is used when caching feeds for this WorkList. For Lanes, the lane_id is used instead.

property uses_customlists

Does the works() implementation for this WorkList look for works on CustomLists?

visible = True
property visible_children

A WorkList’s children can be used to create a grouped acquisition feed for that WorkList.

works(_db, facets=None, pagination=None, search_engine=None, debug=False, **kwargs)[source]

Use a search engine to obtain Work or Work-like objects that belong in this WorkList.

Compare DatabaseBackedWorkList.works_from_database, which uses a database query to obtain the same Work objects.

Parameters:
  • _db – A database connection.

  • facets – A Facets object which may put additional constraints on WorkList membership.

  • pagination – A Pagination object indicating which part of the WorkList the caller is looking at, and/or a limit on the number of works to fetch.

  • kwargs – Different implementations may fetch the list of works from different sources and may need different keyword arguments.

Returns:

A list of Work or Work-like objects, or a database query that generates such a list when executed.
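
Continuing the initialization sketch above, a hedged example of fetching one page of works from the search index. `_db` and a configured ExternalSearchIndex (`search_engine`) are assumed to exist.

    from core.lane import Pagination

    first_page = worklist.works(
        _db,
        pagination=Pagination(offset=0, size=20),
        search_engine=search_engine,
    )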

works_for_hits(_db, hits, facets=None)[source]

Convert a list of search results into Work objects.

This works by calling works_for_resultsets() on a list containing a single list of search results.

Parameters:
  • _db – A database connection

  • hits – A list of Hit objects from ElasticSearch.

Returns:

A list of Work objects or, if the search results include script fields, WorkSearchResult objects.

works_for_resultsets(_db, resultsets, facets=None)[source]

Convert a list of lists of Hit objects into a list of lists of Work objects.

core.lane.configuration_relevant_lifecycle_event(mapper, connection, target)[source]
core.lane.configuration_relevant_update(mapper, connection, target)[source]

core.local_analytics_provider module

class core.local_analytics_provider.LocalAnalyticsProvider(integration, library=None)[source]

Bases: object

CARDINALITY = 1
DESCRIPTION = l'Store analytics events in the 'circulationevents' database table.'
LOCATION_SOURCE = 'location_source'
LOCATION_SOURCE_DISABLED = ''
LOCATION_SOURCE_NEIGHBORHOOD = 'neighborhood'
NAME = l'Local Analytics'
SETTINGS = [{'key': 'location_source', 'label': l'Geographic location of events', 'description': l'Local analytics events may have a geographic location associated with them. How should the location be determined?<p>Note: to use the patron's neighborhood as the event location, you must also tell your patron authentication mechanism how to <i>gather</i> a patron's neighborhood information.', 'default': '', 'type': 'select', 'options': [{'key': '', 'label': l'Disable this feature.'}, {'key': 'neighborhood', 'label': l'Use the patron's neighborhood as the event location.'}]}]
collect_event(library, license_pool, event_type, time, old_value=None, new_value=None, **kwargs)[source]
classmethod initialize(_db)[source]

Find or create a local analytics service.

core.local_analytics_provider.Provider

alias of LocalAnalyticsProvider

core.log module

class core.log.CloudwatchLogs[source]

Bases: Logger

CREATE_GROUP = 'create_group'
DEFAULT_CREATE_GROUP = 'TRUE'
DEFAULT_INTERVAL = 60
DEFAULT_REGION = 'us-west-2'
GROUP = 'group'
INTERVAL = 'interval'
NAME = 'AWS Cloudwatch Logs'
REGION = 'region'
REGIONS = [{'key': 'us-east-2', 'label': l'US East (Ohio)'}, {'key': 'us-east-1', 'label': l'US East (N. Virginia)'}, {'key': 'us-west-1', 'label': l'US West (N. California)'}, {'key': 'us-west-2', 'label': l'US West (Oregon)'}, {'key': 'ap-south-1', 'label': l'Asia Pacific (Mumbai)'}, {'key': 'ap-northeast-3', 'label': l'Asia Pacific (Osaka-Local)'}, {'key': 'ap-northeast-2', 'label': l'Asia Pacific (Seoul)'}, {'key': 'ap-southeast-1', 'label': l'Asia Pacific (Singapore)'}, {'key': 'ap-southeast-2', 'label': l'Asia Pacific (Sydney)'}, {'key': 'ap-northeast-1', 'label': l'Asia Pacific (Tokyo)'}, {'key': 'ca-central-1', 'label': l'Canada (Central)'}, {'key': 'cn-north-1', 'label': l'China (Beijing)'}, {'key': 'cn-northwest-1', 'label': l'China (Ningxia)'}, {'key': 'eu-central-1', 'label': l'EU (Frankfurt)'}, {'key': 'eu-west-1', 'label': l'EU (Ireland)'}, {'key': 'eu-west-2', 'label': l'EU (London)'}, {'key': 'eu-west-3', 'label': l'EU (Paris)'}, {'key': 'sa-east-1', 'label': l'South America (Sao Paulo)'}]
SETTINGS = [{'key': 'group', 'label': l'Log Group', 'default': 'simplified', 'required': True}, {'key': 'stream', 'label': l'Log Stream', 'default': 'simplified', 'required': True}, {'key': 'interval', 'label': l'Update Interval Seconds', 'default': 60, 'required': True}, {'key': 'region', 'label': l'AWS Region', 'type': 'select', 'options': [{'key': 'us-east-2', 'label': l'US East (Ohio)'}, {'key': 'us-east-1', 'label': l'US East (N. Virginia)'}, {'key': 'us-west-1', 'label': l'US West (N. California)'}, {'key': 'us-west-2', 'label': l'US West (Oregon)'}, {'key': 'ap-south-1', 'label': l'Asia Pacific (Mumbai)'}, {'key': 'ap-northeast-3', 'label': l'Asia Pacific (Osaka-Local)'}, {'key': 'ap-northeast-2', 'label': l'Asia Pacific (Seoul)'}, {'key': 'ap-southeast-1', 'label': l'Asia Pacific (Singapore)'}, {'key': 'ap-southeast-2', 'label': l'Asia Pacific (Sydney)'}, {'key': 'ap-northeast-1', 'label': l'Asia Pacific (Tokyo)'}, {'key': 'ca-central-1', 'label': l'Canada (Central)'}, {'key': 'cn-north-1', 'label': l'China (Beijing)'}, {'key': 'cn-northwest-1', 'label': l'China (Ningxia)'}, {'key': 'eu-central-1', 'label': l'EU (Frankfurt)'}, {'key': 'eu-west-1', 'label': l'EU (Ireland)'}, {'key': 'eu-west-2', 'label': l'EU (London)'}, {'key': 'eu-west-3', 'label': l'EU (Paris)'}, {'key': 'sa-east-1', 'label': l'South America (Sao Paulo)'}], 'default': 'us-west-2', 'required': True}, {'key': 'create_group', 'label': l'Automatically Create Log Group', 'type': 'select', 'options': [{'key': 'TRUE', 'label': l'Yes'}, {'key': 'FALSE', 'label': l'No'}], 'default': True, 'required': True}]
SITEWIDE = True
STREAM = 'stream'
classmethod from_configuration(_db, testing=False)[source]

Should be implemented in each logging class.

classmethod get_handler(settings, testing=False)[source]

Turn ExternalIntegration into a log handler.

class core.log.JSONFormatter(app_name)[source]

Bases: Formatter

format(record)[source]

Format the specified record as text.

The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime()), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.

fqdn = 'fv-az1493-85.mbrrm1wwqy3eneuh0nkjogyspa.dx.internal.cloudapp.net'
hostname = 'fv-az1493-85.mbrrm1wwqy3eneuh0nkjogyspa.dx.internal.cloudapp.net'
class core.log.LogConfiguration[source]

Bases: object

Configures the active Python logging handlers based on logging configuration from the database.

DATABASE_LOG_LEVEL = 'database_log_level'
DEBUG = 'DEBUG'
DEFAULT_APP_NAME = 'simplified'
DEFAULT_DATABASE_LOG_LEVEL = 'WARN'
DEFAULT_LOG_LEVEL = 'INFO'
ERROR = 'ERROR'
INFO = 'INFO'
LOG_APP_NAME = 'log_app'
LOG_LEVEL = 'log_level'
LOG_LEVEL_UI = [{'key': 'DEBUG', 'value': l'Debug'}, {'key': 'INFO', 'value': l'Info'}, {'key': 'WARN', 'value': l'Warn'}, {'key': 'ERROR', 'value': l'Error'}]
SITEWIDE_SETTINGS = [{'key': 'log_level', 'label': l'Log Level', 'type': 'select', 'options': [{'key': 'DEBUG', 'value': l'Debug'}, {'key': 'INFO', 'value': l'Info'}, {'key': 'WARN', 'value': l'Warn'}, {'key': 'ERROR', 'value': l'Error'}], 'default': 'INFO'}, {'key': 'log_app', 'label': l'Log Application name', 'description': l'Log messages originating from this application will be tagged with this name. If you run multiple instances, giving each one a different application name will help you determine which instance is having problems.', 'default': 'simplified'}, {'key': 'database_log_level', 'label': l'Database Log Level', 'type': 'select', 'options': [{'key': 'DEBUG', 'value': l'Debug'}, {'key': 'INFO', 'value': l'Info'}, {'key': 'WARN', 'value': l'Warn'}, {'key': 'ERROR', 'value': l'Error'}], 'description': l'Database logs are extremely verbose, so unless you're diagnosing a database-related problem, it's a good idea to set a higher log level for database messages.', 'default': 'WARN'}]
WARN = 'WARN'
classmethod from_configuration(_db, testing=False)[source]

Return the logging policy as configured in the database.

Parameters:
  • _db – A database connection. If None, the default logging policy will be used.

  • testing – A boolean indicating whether a unit test is happening right now. If True, the database configuration will be ignored in favor of a known test-friendly policy. (It’s okay to pass in False during a test of this method.)

Returns:

A 3-tuple (internal_log_level, database_log_level, handlers). internal_log_level is the log level to be used for most log messages. database_log_level is the log level to be applied to the loggers for the database connector and other verbose third-party libraries. handlers is a list of Handler objects that will be associated with the top-level logger.

classmethod initialize(_db, testing=False)[source]

Make the logging handlers reflect the current logging rules as configured in the database.

Parameters:
  • _db – A database connection. If this is None, the default logging configuration will be used.

  • testing – True if unit tests are currently running; otherwise False.
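
A minimal sketch of applying the database-driven logging policy at application startup. `session` is assumed to be an open database session; passing None instead would fall back to the default policy.

    from core.log import LogConfiguration

    LogConfiguration.initialize(session)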

class core.log.Logger[source]

Bases: object

Abstract base class for logging

DEFAULT_APP_NAME = 'simplified'
DEFAULT_MESSAGE_TEMPLATE = '%(asctime)s:%(name)s:%(levelname)s:%(filename)s:%(message)s'
JSON_LOG_FORMAT = 'json'
TEXT_LOG_FORMAT = 'text'
classmethod from_configuration(_db, testing=False)[source]

Should be implemented in each logging class.

classmethod set_formatter(handler, app_name=None, log_format=None, message_template=None)[source]

Tell the given handler to format its log messages in a certain way.

class core.log.Loggly[source]

Bases: Logger

DEFAULT_LOGGLY_URL = 'https://logs-01.loggly.com/inputs/%(token)s/tag/python/'
NAME = 'Loggly'
PASSWORD = 'password'
SETTINGS = [{'key': 'user', 'label': l'Username', 'required': True}, {'key': 'password', 'label': l'Password', 'required': True}, {'key': 'url', 'label': l'URL', 'required': True, 'format': 'url'}]
SITEWIDE = True
URL = 'url'
USER = 'user'
classmethod from_configuration(_db, testing=False)[source]

Should be implemented in each logging class.

classmethod loggly_handler(externalintegration)[source]

Turn a Loggly ExternalIntegration into a log handler.

classmethod set_formatter(handler, app_name)[source]

Tell the given handler to format its log messages in a certain way.

class core.log.StringFormatter(fmt=None, datefmt=None, style='%', validate=True, *, defaults=None)[source]

Bases: Formatter

Encode all output as a string.

format(record)[source]

Format the specified record as text.

The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime()), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.

class core.log.SysLogger[source]

Bases: Logger

LOG_FORMAT = 'log_format'
LOG_MESSAGE_TEMPLATE = 'message_template'
NAME = 'sysLog'
SETTINGS = [{'key': 'log_format', 'label': l'Log Format', 'type': 'select', 'options': [{'key': 'json', 'label': l'json'}, {'key': 'text', 'label': l'text'}]}, {'key': 'message_template', 'label': l'template', 'default': '%(asctime)s:%(name)s:%(levelname)s:%(filename)s:%(message)s', 'required': True}]
SITEWIDE = True
classmethod from_configuration(_db, testing=False)[source]

Should be implemented in each logging class.

core.marc module

class core.marc.Annotator[source]

Bases: object

The Annotator knows how to add information about a Work to a MARC record.

AUDIENCE_TERMS = {'Adult': 'General', 'Adults Only': 'Adult', 'Children': 'Juvenile', 'Young Adult': 'Adolescent'}
FORMAT_TERMS = {('application/epub+zip', None): 'EPUB eBook', ('application/epub+zip', 'application/vnd.adobe.adept+xml'): 'Adobe EPUB eBook', ('application/pdf', None): 'PDF eBook', ('application/pdf', 'application/vnd.adobe.adept+xml'): 'Adobe PDF eBook'}
classmethod add_audience(record, work)[source]
classmethod add_contributors(record, edition)[source]

Create contributor fields for this edition.

TODO: Use canonical names from LoC.

classmethod add_control_fields(record, identifier, pool, edition)[source]
classmethod add_distributor(record, pool)[source]
classmethod add_ebooks_subject(record)[source]
classmethod add_formats(record, pool)[source]
classmethod add_isbn(record, identifier)[source]
classmethod add_marc_organization_code(record, marc_org)[source]
classmethod add_physical_description(record, edition)[source]
classmethod add_publisher(record, edition)[source]
classmethod add_series(record, edition)[source]
classmethod add_simplified_genres(record, work)[source]

Create subject fields for this work.

classmethod add_summary(record, work)[source]
classmethod add_system_details(record)[source]
classmethod add_title(record, edition)[source]
annotate_work_record(work, active_license_pool, edition, identifier, record, integration=None, updated=None)[source]

Add metadata from this work to a MARC record.

Work:

The Work whose record is being annotated.

Active_license_pool:

Of all the LicensePools associated with this Work, the client has expressed interest in this one.

Edition:

The Edition to use when associating bibliographic metadata with this entry.

Identifier:

Of all the Identifiers associated with this Work, the client has expressed interest in this one.

Parameters:

record – A MARCRecord object to be annotated.

classmethod leader(work)[source]
marc_cache_field = 'marc_record'
class core.marc.MARCExporter(_db, library, integration)[source]

Bases: object

Turn a work into a record for a MARC file.

DEFAULT_MIRROR_INTEGRATION = {'key': 'NO_MIRROR', 'label': l'None - Do not mirror MARC files'}
DEFAULT_UPDATE_FREQUENCY = 30
DESCRIPTION = l'Export metadata into MARC files that can be imported into an ILS manually.'
INCLUDE_SIMPLIFIED_GENRES = 'include_simplified_genres'
INCLUDE_SUMMARY = 'include_summary'
LIBRARY_SETTINGS = [{'key': 'marc_update_frequency', 'label': l'Update frequency (in days)', 'description': l'The circulation manager will wait this number of days between generating MARC files.', 'type': 'number', 'default': 30}, {'key': 'marc_organization_code', 'label': l'The MARC organization code for this library (003 field).', 'description': l'MARC organization codes are assigned by the Library of Congress.'}, {'key': 'marc_web_client_url', 'label': l'The base URL for the web catalog for this library, for the 856 field.', 'description': l'If using a library registry that provides a web catalog, this can be left blank.'}, {'key': 'include_summary', 'label': l'Include summaries in MARC records (520 field)', 'type': 'select', 'options': [{'key': 'false', 'label': l'Do not include summaries'}, {'key': 'true', 'label': l'Include summaries'}], 'default': 'false'}, {'key': 'include_simplified_genres', 'label': l'Include Library Simplified genres in MARC records (650 fields)', 'type': 'select', 'options': [{'key': 'false', 'label': l'Do not include Library Simplified genres'}, {'key': 'true', 'label': l'Include Library Simplified genres'}], 'default': 'false'}]
MARC_ORGANIZATION_CODE = 'marc_organization_code'
NAME = 'MARC Export'
NO_MIRROR_INTEGRATION = 'NO_MIRROR'
SETTING = {'description': l'Storage protocol to use for uploading generated MARC files. The service must already be configured under 'Storage Services'.', 'key': 'mirror_integration_id', 'label': l'MARC Mirror', 'options': [{'key': 'NO_MIRROR', 'label': l'None - Do not mirror MARC files'}], 'type': 'select'}
UPDATE_FREQUENCY = 'marc_update_frequency'
WEB_CLIENT_URL = 'marc_web_client_url'
classmethod create_record(work, annotator, force_create=False, integration=None)[source]

Build a complete MARC record for a given work.

classmethod from_config(library)[source]
classmethod get_storage_settings(_db)[source]
records(lane, annotator, mirror_integration, start_time=None, force_refresh=False, mirror=None, search_engine=None, query_batch_size=500, upload_batch_size=7500)[source]

Create and export a MARC file for the books in a lane.

Parameters:
  • lane – The Lane to export books from.

  • annotator – The Annotator to use when creating MARC records.

  • mirror_integration – The mirror integration to use for MARC files.

  • start_time – Only include records that were created or modified after this time.

  • force_refresh – Create new records even when cached records are available.

  • mirror – Optional mirror to use instead of loading one from configuration.

  • query_batch_size – Number of works to retrieve with a single Elasticsearch query.

  • upload_batch_size – Number of records to mirror at a time. This is different from query_batch_size because S3 enforces a minimum size of 5MB for all parts of a multipart upload except the last, but 5MB of records would be too many works for a single query.
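
A minimal sketch of exporting MARC records for a lane. `library`, `lane`, and `mirror_integration` are assumed to exist, and a MARC Export integration is assumed to be configured for the library.

    from core.marc import Annotator, MARCExporter

    exporter = MARCExporter.from_config(library)
    exporter.records(lane, Annotator(), mirror_integration)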

class core.marc.MARCExporterFacets(start_time)[source]

Bases: BaseFacets

A faceting object used to configure the search engine so that it only finds works updated since a certain time.

modify_search_filter(filter)[source]

Modify an external_search.Filter object to filter out works excluded by the business logic of this faceting class.

core.metadata_layer module

An abstract way of representing incoming metadata and applying it to Identifiers and Editions.

This acts as an intermediary between the third-party integrations (which have this information in idiosyncratic formats) and the model. Doing a third-party integration should be as simple as putting the information into this format.

exception core.metadata_layer.CSVFormatError[source]

Bases: Error

class core.metadata_layer.CSVMetadataImporter(data_source_name, title_field='title', language_field='language', default_language='eng', medium_field='medium', default_medium='Book', series_field='series', publisher_field='publisher', imprint_field='imprint', issued_field='issued', published_field=['published', 'publication year'], identifier_fields={'Axis 360 ID': ('axis 360 id', 0.75), 'Bibliotheca ID': ('3m id', 0.75), 'ISBN': ('isbn', 0.75), 'Overdrive ID': ('overdrive id', 0.75)}, subject_fields={'age': ('schema:typicalAgeRange', 100.0), 'audience': ('schema:audience', 100.0), 'tags': ('tag', 100.0)}, sort_author_field='file author as', display_author_field=['author', 'display author as'])[source]

Bases: object

Turn a CSV file into a list of Metadata objects.

DEFAULT_IDENTIFIER_FIELD_NAMES = {'Axis 360 ID': ('axis 360 id', 0.75), 'Bibliotheca ID': ('3m id', 0.75), 'ISBN': ('isbn', 0.75), 'Overdrive ID': ('overdrive id', 0.75)}
DEFAULT_SUBJECT_FIELD_NAMES = {'age': ('schema:typicalAgeRange', 100.0), 'audience': ('schema:audience', 100.0), 'tags': ('tag', 100.0)}
IDENTIFIER_PRECEDENCE = ['Axis 360 ID', 'Overdrive ID', 'Bibliotheca ID', 'ISBN']
property identifier_field_names

All potential field names that would identify an identifier.

list_field(row, names)[source]

Parse a string into a list by splitting on commas.

log = <Logger CSV metadata importer (WARNING)>
row_to_metadata(row)[source]
to_metadata(dictreader)[source]

Turn the CSV file in dictreader into a sequence of Metadata.

Yield:

A sequence of Metadata objects.
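
A minimal sketch of importing metadata from a CSV file. The data source name and file path are illustrative; only the documented constructor and to_metadata() are relied on.

    import csv

    from core.metadata_layer import CSVMetadataImporter

    importer = CSVMetadataImporter("Example Data Source")
    with open("titles.csv", newline="") as f:
        for metadata in importer.to_metadata(csv.DictReader(f)):
            print(metadata.title, metadata.primary_identifier)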

class core.metadata_layer.CirculationData(data_source, primary_identifier, licenses_owned=None, licenses_available=None, licenses_reserved=None, patrons_in_hold_queue=None, formats=None, default_rights_uri=None, links=None, licenses=None, last_checked=None)[source]

Bases: MetaToModelUtility

Information about actual copies of a book that can be delivered to patrons.

As distinct from Metadata, which is a container for information about a book.

Basically,

Metadata : Edition :: CirculationData : LicensePool

apply(_db, collection, replace=None)[source]

Update the title with this CirculationData’s information.

Parameters:

collection – A Collection representing actual copies of this title. Availability information (e.g. number of copies) will be associated with a LicensePool in this Collection. If this is not present, only delivery information (e.g. format information and open-access downloads) will be processed.

data_source(_db)[source]

Find the DataSource associated with this circulation information.

Does this Circulation object have an associated open-access link?

license_pool(_db, collection, analytics=None)[source]

Find or create a LicensePool object for this CirculationData.

Parameters:
  • collection – The LicensePool object will be associated with the given Collection.

  • analytics – If the LicensePool is newly created, the event will be tracked with this.

log = <Logger Abstract metadata layer - Circulation data (WARNING)>
primary_identifier(_db)[source]

Find the Identifier associated with this circulation information.

set_default_rights_uri(data_source_name, default_rights_uri=None)[source]
class core.metadata_layer.ContributorData(sort_name=None, display_name=None, family_name=None, wikipedia_name=None, roles=None, lc=None, viaf=None, biography=None, aliases=None, extra=None)[source]

Bases: object

apply(destination, replace=None)[source]

Update the passed-in Contributor-type object with this ContributorData’s information.

Param:

destination – the Contributor or ContributorData object to write this ContributorData object’s metadata to.

Param:

replace – Replacement policy (not currently used).

Returns:

the possibly changed Contributor object and a flag of whether it’s been changed.

classmethod display_name_to_sort_name_from_existing_contributor(_db, display_name)[source]

Find the sort name for this book’s author, assuming it’s easy.

‘Easy’ means we already have an established sort name for a Contributor with this exact display name.

If it’s not easy, this will be taken care of later with a call to the metadata wrangler’s author canonicalization service.

If we have a copy of this book in our collection (the only time an external list item is relevant), this will probably be easy.

display_name_to_sort_name_through_canonicalizer(_db, identifiers, metadata_client)[source]
find_sort_name(_db, identifiers, metadata_client)[source]

Try as hard as possible to find this person’s sort name.

classmethod from_contribution(contribution)[source]

Create a ContributorData object from a data-model Contribution object.

classmethod lookup(_db, sort_name=None, display_name=None, lc=None, viaf=None)[source]

Create a (potentially synthetic) ContributorData based on the best available information in the database.

Returns:

A ContributorData.

class core.metadata_layer.FormatData(content_type, drm_scheme, link=None, rights_uri=None)[source]

Bases: object

class core.metadata_layer.IdentifierData(type, identifier, weight=1)[source]

Bases: object

load(_db)[source]
class core.metadata_layer.LicenseData(identifier, checkout_url, status_url, expires=None, remaining_checkouts=None, concurrent_checkouts=None)[source]

Bases: object

class core.metadata_layer.LinkData(rel, href=None, media_type=None, content=None, thumbnail=None, rights_uri=None, rights_explanation=None, original=None, transformation_settings=None)[source]

Bases: object

property guessed_media_type

If the media type of a link is unknown, take a guess.

mirror_type()[source]

Returns the type of mirror that should be used for the link.

class core.metadata_layer.MARCExtractor[source]

Bases: object

Transform a MARC file into a list of Metadata objects.

This is not totally general, but it’s a good start.

END_OF_AUTHOR_NAME_RES = [re.compile(',\\s+[0-9]+-'), re.compile(',\\s+active '), re.compile(',\\s+graf,'), re.compile(',\\s+author.')]
classmethod name_cleanup(name)[source]
classmethod parse(file, data_source_name, default_medium_type=None)[source]
classmethod parse_year(value)[source]

Handle a publication year that may not be in the right format.

class core.metadata_layer.MeasurementData(quantity_measured, value, weight=1, taken_at=None)[source]

Bases: object

class core.metadata_layer.MetaToModelUtility[source]

Bases: object

Contains functionality common to both CirculationData and Metadata.

log = <Logger Abstract metadata layer - mirror code (WARNING)>

Retrieve a copy of the given link and make sure it gets mirrored. If it’s a full-size image, create a thumbnail and mirror that too.

The model_object can be either a pool or an edition.

class core.metadata_layer.Metadata(data_source, title=None, subtitle=None, sort_title=None, language=None, medium=None, series=None, series_position=None, publisher=None, imprint=None, issued=None, published=None, primary_identifier=None, identifiers=None, recommendations=None, subjects=None, contributors=None, measurements=None, links=None, data_source_last_updated=None, circulation=None, **kwargs)[source]

Bases: MetaToModelUtility

A (potentially partial) set of metadata for a published work.

BASIC_EDITION_FIELDS = ['title', 'sort_title', 'subtitle', 'language', 'medium', 'series', 'series_position', 'publisher', 'imprint', 'issued', 'published']
REL_REQUIRES_FULL_RECALCULATION = ['http://schema.org/description']
REL_REQUIRES_NEW_PRESENTATION_EDITION = ['http://opds-spec.org/image', 'http://opds-spec.org/image/thumbnail']
apply(edition, collection, metadata_client=None, replace=None, replace_identifiers=False, replace_subjects=False, replace_contributions=False, replace_links=False, replace_formats=False, replace_rights=False, force=False)[source]

Apply this metadata to the given edition.

Returns:

(edition, made_core_changes), where edition is the newly-updated object, and made_core_changes answers the question: were any edition core fields harmed in the making of this update? So, if title changed, return True. New: If contributors changed, this is now considered a core change, so work.simple_opds_feed refresh can be triggered.
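
A minimal sketch of applying a Metadata object to an Edition. `_db`, `edition`, and `collection` are assumed to exist; the identifier is illustrative, and passing the data source by name is an assumption about what the constructor accepts.

    from core.metadata_layer import IdentifierData, Metadata, ReplacementPolicy

    metadata = Metadata(
        data_source="Example Data Source",  # assumed to accept a name
        primary_identifier=IdentifierData("ISBN", "9780000000000"),
        title="An Example Title",
    )
    edition, made_core_changes = metadata.apply(
        edition, collection, replace=ReplacementPolicy.from_metadata_source()
    )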

associate_with_identifiers_based_on_permanent_work_id(_db)[source]

Try to associate this object’s primary identifier with the primary identifiers of Editions in the database which share a permanent work ID.

calculate_permanent_work_id(_db, metadata_client)[source]

Try to calculate a permanent work ID from this metadata.

This may require asking a metadata wrangler to turn a display name into a sort name; hence the metadata_client argument.

consolidate_identifiers()[source]
data_source(_db)[source]
edition(_db, create_if_not_exists=True)[source]

Find or create the edition described by this Metadata object.

filter_recommendations(_db)[source]

Filters out recommended identifiers that don’t exist in the db. Any IdentifierData objects will be replaced with Identifiers.

classmethod from_edition(edition)[source]

Create a basic Metadata object for the given Edition.

This doesn’t contain everything but it contains enough information to run guess_license_pools.

guess_license_pools(_db, metadata_client)[source]

Try to find existing license pools for this Metadata.

log = <Logger Abstract metadata layer (WARNING)>
make_thumbnail(data_source, link, link_obj)[source]

Make sure a Hyperlink representing an image is connected to its thumbnail.

normalize_contributors(metadata_client)[source]

Make sure that all contributors without a .sort_name get one.

property primary_author
update(metadata)[source]

Update this Metadata object with values from the given Metadata object.

TODO: We might want to take a policy object as an argument.

update_contributions(_db, edition, metadata_client=None, replace=True)[source]
class core.metadata_layer.ReplacementPolicy(identifiers=False, subjects=False, contributions=False, links=False, formats=False, rights=False, link_content=False, mirrors=None, content_modifier=None, analytics=None, http_get=None, even_if_not_apparently_updated=False, presentation_calculation_policy=None)[source]

Bases: object

How serious should we be about overwriting old metadata with this new metadata?

classmethod append_only(**args)[source]

Don’t overwrite any information, just append it.

This should probably never be used.

classmethod from_license_source(_db, **args)[source]

When gathering data from the license source, overwrite all old data from this source with new data from the same source. Also overwrite an old rights status with an updated status and update the list of available formats. Log availability changes to the configured analytics services.

classmethod from_metadata_source(**args)[source]

When gathering data from a metadata source, overwrite all old data from this source, but do not overwrite the rights status or the available formats. License sources are the authority on rights and formats, and metadata sources have no say in the matter.

class core.metadata_layer.SubjectData(type, identifier, name=None, weight=1)[source]

Bases: object

property key
class core.metadata_layer.TimestampData(start=None, finish=None, achievements=None, counter=None, exception=None)[source]

Bases: object

CLEAR_VALUE = <object object>
apply(_db)[source]
collection(_db)[source]
finalize(service, service_type, collection, start=None, finish=None, achievements=None, counter=None, exception=None)[source]

Finalize any values that were not set during the constructor.

This is intended to be run by the code that originally ran the service.

The given values for start, finish, achievements, counter, and exception will be used only if the service did not specify its own values for those fields.

property is_complete

Does this TimestampData represent an operation that has completed?

An operation is completed if it has failed, or if the time of its completion is known.

property is_failure

Does this TimestampData represent an unrecoverable failure?

core.mirror module

class core.mirror.MirrorUploader(integration, host)[source]

Bases: object

Handles the job of uploading a representation’s content to a mirror that we control.

IMPLEMENTATION_REGISTRY = {'Amazon S3': <class 'core.s3.S3Uploader'>, 'LCP': <class 'api.lcp.mirror.LCPMirror'>, 'MinIO': <class 'core.s3.MinIOUploader'>}
STORAGE_GOAL = 'storage'
book_url(identifier, extension='.epub', open_access=True, data_source=None, title=None)[source]

The URL of the hosted EPUB file for the given identifier.

This does not upload anything to the URL, but it is expected that calling mirror() on a certain Representation object will make that representation end up at that URL.

cover_image_url(data_source, identifier, filename=None, scaled_size=None)[source]

The URL of the hosted cover image for the given identifier.

This does not upload anything to the URL, but it is expected that calling mirror() on a certain Representation object will make that representation end up at that URL.

do_upload(representation)[source]
classmethod for_collection(collection, purpose)[source]

Create a MirrorUploader for the given Collection.

Parameters:
  • collection – Use the mirror configuration for this Collection.

  • purpose – Use the purpose of the mirror configuration.

Returns:

A MirrorUploader, or None if the Collection has no mirror integration.

classmethod implementation(integration)[source]

Instantiate the appropriate implementation of MirrorUploader for the given ExternalIntegration.

classmethod integration_by_name(_db, storage_name=None)[source]

Find the ExternalIntegration for the mirror by storage name.

is_self_url(url)[source]

Determines whether the URL has the mirror’s host or a custom domain

Parameters:

url (string) – The URL

Returns:

Boolean value indicating whether the URL has the mirror’s host or a custom domain

Return type:

bool

classmethod mirror(_db, storage_name=None, integration=None)[source]

Create a MirrorUploader from an integration or storage name.

Parameters:
  • storage_name – The name of the storage integration.

  • integration – The external integration.

Returns:

A MirrorUploader.

Raise:

CannotLoadConfiguration if no integration with goal==STORAGE_GOAL is configured.
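
A minimal sketch of obtaining an uploader by storage integration name and mirroring a single Representation. `_db`, `data_source`, `identifier`, and `representation` are assumed to exist; the integration name and filename are illustrative.

    from core.mirror import MirrorUploader

    uploader = MirrorUploader.mirror(_db, storage_name="Example S3 Storage")
    url = uploader.cover_image_url(data_source, identifier, filename="cover.jpg")
    uploader.mirror_one(representation, mirror_to=url)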

mirror_batch(representations)[source]

Mirror a batch of Representations at once.

mirror_one(representation, mirror_to, collection=None)[source]

Mirror a single Representation.

sign_url(url, expiration=None)[source]

Signs a URL and makes it expirable

Parameters:
  • url (string) – URL

  • expiration (int) – (Optional) Time in seconds for the presigned URL to remain valid. The default value depends on the specific implementation.

Returns:

Signed expirable link

Return type:

string

abstract split_url(url, unquote=True)[source]

Splits the URL into the components: container (bucket) and file path

Parameters:
  • url (string) – URL

  • unquote (bool) – Boolean value indicating whether it’s required to unquote URL elements

Returns:

Tuple (bucket, file path)

Return type:

Tuple[string, string]

core.mock_analytics_provider module

class core.mock_analytics_provider.MockAnalyticsProvider(integration=None, library=None)[source]

Bases: object

A mock analytics provider that keeps track of how many times it’s called.

collect_event(library, lp, event_type, time=None, **kwargs)[source]
core.mock_analytics_provider.Provider

alias of MockAnalyticsProvider

core.monitor module

class core.monitor.CachedFeedReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove cached feeds older than thirty days.

MAX_AGE = 30
MODEL_CLASS

alias of CachedFeed

TIMESTAMP_FIELD = 'timestamp'
class core.monitor.CirculationEventLocationScrubber(*args, **kwargs)[source]

Bases: ScrubberMonitor

Scrub location information from old CirculationEvents.

MAX_AGE = 365
MODEL_CLASS

alias of CirculationEvent

SCRUB_FIELD = 'location'
TIMESTAMP_FIELD = 'start'
class core.monitor.CollectionMonitor(_db, collection)[source]

Bases: Monitor

A Monitor that does something for all Collections that come from a certain provider.

This class is designed to be subclassed rather than instantiated directly. Subclasses must define SERVICE_NAME and PROTOCOL. Subclasses may define replacement values for DEFAULT_START_TIME and DEFAULT_COUNTER.

PROTOCOL = None
classmethod all(_db, collections=None, **constructor_kwargs)[source]

Yield a sequence of CollectionMonitor objects: one for every Collection associated with cls.PROTOCOL.

If collections is specified, then there must be a Monitor for each one and Monitors will be yielded in the same order that the collections are specified. Otherwise, Monitors will be yielded as follows…

Monitors that have no Timestamp will be yielded first. After that, Monitors with older values for Timestamp.start will be yielded before Monitors with newer values.

Parameters:
  • _db – Database session object.

  • collections (List[core.model.collection.Collection]) – An optional list of collections. If None, we’ll process all collections.

  • constructor_kwargs – These keyword arguments will be passed into the CollectionMonitor constructor.
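
A minimal sketch of a CollectionMonitor subclass and of running it against every matching Collection. The service name and protocol are illustrative assumptions.

    from core.monitor import CollectionMonitor

    class ExampleCollectionMonitor(CollectionMonitor):
        SERVICE_NAME = "Example collection monitor"  # illustrative
        PROTOCOL = "Example Protocol"                # illustrative

        def run_once(self, progress):
            ...  # per-collection work goes here

    # One monitor per Collection that uses PROTOCOL; `_db` is assumed to exist.
    for monitor in ExampleCollectionMonitor.all(_db):
        monitor.run()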

class core.monitor.CollectionMonitorLogger(logger, extra)[source]

Bases: LoggerAdapter

Prefix log messages with a collection, if one is present.

process(msg, kwargs)[source]

Process the logging message and keyword arguments passed in to a logging call to insert contextual information. You can either manipulate the message itself, the keyword args or both. Return the message and kwargs modified (or not) to suit your needs.

Normally, you’ll only need to override this one method in a LoggerAdapter subclass for your specific needs.

class core.monitor.CollectionReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove collections that have been marked for deletion.

MODEL_CLASS

alias of Collection

delete(collection)[source]

Delete a Collection from the database.

Database deletion of a Collection might take a really long time, so we call a special method that will do the deletion incrementally and can pick up where it left off if there’s a failure.

property where_clause

A SQLAlchemy clause that identifies the database rows to be reaped.

exception core.monitor.CoverageProvidersFailed(failed_providers)[source]

Bases: Exception

We tried to run CoverageProviders on a Work’s identifier, but some of the providers failed.

class core.monitor.CredentialReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove Credentials that expired more than a day ago.

MAX_AGE = 1
MODEL_CLASS

alias of Credential

TIMESTAMP_FIELD = 'expires'
class core.monitor.CustomListEntrySweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: SweepMonitor

A Monitor that does something to every CustomListEntry.

MODEL_CLASS

alias of CustomListEntry

scope_to_collection(qu, collection)[source]

Restrict the query to only find CustomListEntries whose Work is in the given Collection.

class core.monitor.CustomListEntryWorkUpdateMonitor(_db, collection=None, batch_size=None)[source]

Bases: CustomListEntrySweepMonitor

Set or reset the Work associated with each custom list entry.

DEFAULT_BATCH_SIZE = 100
SERVICE_NAME = 'Update Works for custom list entries'
process_item(item)[source]

Do the work that needs to be done for a given item.

class core.monitor.EditionSweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: SweepMonitor

A Monitor that does something to every Edition.

MODEL_CLASS

alias of Edition

scope_to_collection(qu, collection)[source]

Restrict the query to only find Editions whose primary Identifier is licensed to the given Collection.

class core.monitor.IdentifierSweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: SweepMonitor

A Monitor that does some work for every Identifier.

MODEL_CLASS

alias of Identifier

scope_to_collection(qu, collection)[source]

Only find Identifiers licensed through the given Collection.

class core.monitor.MakePresentationReadyMonitor(_db, coverage_providers, collection=None)[source]

Bases: NotPresentationReadyWorkSweepMonitor

A monitor that makes works presentation ready.

By default this works by passing the work’s active edition into ensure_coverage() for each of a list of CoverageProviders. If all the ensure_coverage() calls succeed, presentation of the work is calculated and the work is marked presentation ready.

SERVICE_NAME = 'Make Works Presentation Ready'
prepare(work)[source]

Try to make a single Work presentation-ready.

Raises:

CoverageProvidersFailed – If we can’t make a Work presentation-ready because one or more CoverageProviders failed.

process_item(work)[source]

Do the work necessary to make one Work presentation-ready, and handle exceptions.

run()[source]

Before doing anything, consolidate works.

class core.monitor.MeasurementReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove measurements that are not the most recent

MODEL_CLASS

alias of Measurement

run()[source]

Do all the work that has piled up since the last time the Monitor ran to completion.

run_once(*args, **kwargs)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

property where_clause

A SQLAlchemy clause that identifies the database rows to be reaped.

class core.monitor.Monitor(_db, collection=None)[source]

Bases: object

A Monitor is responsible for running some piece of code on a regular basis. A Monitor has an associated Timestamp that tracks the last time it successfully ran; it may use this information on its next run to cover the intervening span of time.

A Monitor will run to completion and then stop. To repeatedly run a Monitor, you’ll need to repeatedly invoke it from some external source such as a cron job.

This class is designed to be subclassed rather than instantiated directly. Subclasses must define SERVICE_NAME. Subclasses may define replacement values for DEFAULT_START_TIME and DEFAULT_COUNTER.

Although any Monitor may be associated with a Collection, it’s most useful to subclass CollectionMonitor if you’re writing code that needs to be run on every Collection of a certain type.
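
For illustration, a minimal subclass sketch following the pattern described above. The service name and the body of run_once are hypothetical, session stands in for an existing SQLAlchemy session, and the import location of TimestampData is an assumption.

    from core.metadata_layer import TimestampData  # assumed location of TimestampData
    from core.monitor import Monitor

    class ExampleCleanupMonitor(Monitor):
        # Subclasses must define SERVICE_NAME; this value is illustrative.
        SERVICE_NAME = "Example cleanup"

        def run_once(self, progress):
            # Do the real work here, then return a fresh TimestampData rather
            # than modifying and returning the incoming progress object.
            return TimestampData(achievements="nothing needed doing")

    # Typically invoked from a cron job or management script:
    # ExampleCleanupMonitor(session).run()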

DEFAULT_COUNTER = None
DEFAULT_START_TIME = datetime.timedelta(seconds=60)
NEVER = <object object>
ONE_MINUTE_AGO = datetime.timedelta(seconds=60)
ONE_YEAR_AGO = datetime.timedelta(days=365)
SERVICE_NAME = None
cleanup()[source]

Do any work that needs to be done at the end, once the main work has completed successfully.

property collection

Retrieve the Collection object associated with this Monitor.

property initial_start_time

The time that should be used as the ‘start time’ the first time this Monitor is run.

property log
run()[source]

Do all the work that has piled up since the last time the Monitor ran to completion.

run_once(progress)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

timestamp()[source]

Find or create a Timestamp for this Monitor.

This does not use TimestampData because it relies on checking whether a Timestamp already exists in the database.

A new timestamp will have .finish set to None, since the first run is presumably in progress.

class core.monitor.NotPresentationReadyWorkSweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: WorkSweepMonitor

A Monitor that does something to every Work that is not presentation-ready.

item_query()[source]

Find the items that need to be processed in the sweep.

Returns:

A query object.

class core.monitor.OPDSEntryCacheMonitor(_db, collection=None, batch_size=None)[source]

Bases: PresentationReadyWorkSweepMonitor

A Monitor that recalculates the OPDS entries for every presentation-ready Work.

This is different from the OPDSEntryWorkCoverageProvider, which only processes works that are missing a WorkCoverageRecord with the ‘generate-opds’ operation.

SERVICE_NAME = 'ODPS Entry Cache Monitor'
process_item(work)[source]

Do the work that needs to be done for a given item.

class core.monitor.PatronNeighborhoodScrubber(*args, **kwargs)[source]

Bases: ScrubberMonitor

Scrub cached neighborhood information from patrons who haven’t been seen in a while.

MAX_AGE = datetime.timedelta(seconds=43200)
MODEL_CLASS

alias of Patron

SCRUB_FIELD = 'cached_neighborhood'
TIMESTAMP_FIELD = 'last_external_sync'
class core.monitor.PatronRecordReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove patron records that expired more than 60 days ago

MAX_AGE = 60
MODEL_CLASS

alias of Patron

TIMESTAMP_FIELD = 'authorization_expires'
class core.monitor.PermanentWorkIDRefreshMonitor(_db, collection=None, batch_size=None)[source]

Bases: EditionSweepMonitor

A monitor that calculates or recalculates the permanent work ID for every edition.

SERVICE_NAME = 'Permanent work ID refresh'
process_item(edition)[source]

Do the work that needs to be done for a given item.

class core.monitor.PresentationReadyWorkSweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: WorkSweepMonitor

A Monitor that does something to every presentation-ready Work.

item_query()[source]

Find the items that need to be processed in the sweep.

Returns:

A query object.

class core.monitor.ReaperMonitor(*args, **kwargs)[source]

Bases: Monitor

A Monitor that deletes database rows that have expired but have no other process to delete them.

A subclass of ReaperMonitor MUST define values for the following constants:

  • MODEL_CLASS - The model class this monitor is reaping, e.g. Credential.

  • TIMESTAMP_FIELD - Within the model class, the DateTime field to be used when deciding which rows to delete, e.g. ‘expires’. The reaper will be more efficient if there’s an index on this field.

  • MAX_AGE - A datetime.timedelta or number of days representing the time that must pass before an item can be safely deleted.

A subclass of ReaperMonitor MAY define a value for the following constant:

  • BATCH_SIZE - The number of rows to fetch for deletion in a single batch. The default is 1000.

If your model class has fields that might contain a lot of data and aren’t important to the reaping process, put their field names into a list called LARGE_FIELDS and the Reaper will avoid fetching that information, improving performance.
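
A hypothetical subclass in the same spirit as the CredentialReaper above, showing how those constants are typically declared. The age threshold is illustrative, and it is assumed that Credential can be imported from core.model.

    from core.model import Credential  # a model with an 'expires' DateTime column
    from core.monitor import ReaperMonitor

    class StaleCredentialReaper(ReaperMonitor):
        """Hypothetical reaper: delete Credentials that expired over a week ago."""
        MODEL_CLASS = Credential
        TIMESTAMP_FIELD = "expires"
        MAX_AGE = 7  # days; a datetime.timedelta works as well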

BATCH_SIZE = 1000
MAX_AGE = None
MODEL_CLASS = None
REGISTRY = [<class 'core.monitor.CachedFeedReaper'>, <class 'core.monitor.CredentialReaper'>, <class 'core.monitor.WorkReaper'>, <class 'core.monitor.CollectionReaper'>, <class 'core.monitor.MeasurementReaper'>, <class 'core.monitor.CirculationEventLocationScrubber'>, <class 'core.monitor.PatronNeighborhoodScrubber'>, <class 'api.monitor.LoanReaper'>, <class 'api.monitor.HoldReaper'>, <class 'api.monitor.IdlingAnnotationReaper'>]
TIMESTAMP_FIELD = None
property cutoff

Items with a timestamp earlier than this time will be reaped.

delete(row)[source]

Delete a row from the database.

CAUTION: If you override this method such that it doesn’t actually delete the database row, then run_once() may enter an infinite loop.

query()[source]
run_once(*args, **kwargs)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

property timestamp_field
property where_clause

A SQLAlchemy clause that identifies the database rows to be reaped.

class core.monitor.ScrubberMonitor(*args, **kwargs)[source]

Bases: ReaperMonitor

Scrub information from the database.

Unlike the other ReaperMonitors, this class doesn’t delete rows from the database – it only clears out specific data fields.

In addition to the constants required for ReaperMonitor, a subclass of ScrubberMonitor MUST define a value for the following constant:

  • SCRUB_FIELD - The field whose value will be set to None when a row is scrubbed.

run_once(*args, **kwargs)[source]

Find all rows that need to be scrubbed, and scrub them.

property scrub_field

Find the SQLAlchemy representation of the model field to be scrubbed.

property where_clause

Find rows that are older than MAX_AGE _and_ which have a non-null SCRUB_FIELD. If the field is already null, there’s no need to scrub it.

class core.monitor.SubjectSweepMonitor(_db, subject_type=None, filter_string=None)[source]

Bases: SweepMonitor

A Monitor that does some work for every Subject.

DEFAULT_BATCH_SIZE = 500
MODEL_CLASS

alias of Subject

item_query()[source]

Find only Subjects that match the given filters.

scope_to_collection(qu, collection)[source]

Refuse to scope this query to a Collection.

class core.monitor.SweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: CollectionMonitor

A monitor that does some work for every item in a database table, then stops.

Progress through the table is stored in the Timestamp, so that if the Monitor crashes, the next time the Monitor is run, it starts at the item that caused the crash, rather than starting from the beginning of the table.
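
A hypothetical sweep built on EditionSweepMonitor (defined above); the service name and the per-item work are illustrative, and session and collection are assumed to exist.

    from core.monitor import EditionSweepMonitor

    class TitleWhitespaceSweep(EditionSweepMonitor):
        """Hypothetical sweep: tidy the title of every Edition in a collection."""
        SERVICE_NAME = "Title whitespace sweep"

        def process_item(self, edition):
            # Called once per Edition; progress is checkpointed in the Timestamp,
            # so a crash resumes near the offending item rather than at the start.
            if edition.title:
                edition.title = edition.title.strip()

    # TitleWhitespaceSweep(session, collection=collection).run()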

COMPLETION_LOG_LEVEL = 20
DEFAULT_BATCH_SIZE = 100
DEFAULT_COUNTER = 0
MODEL_CLASS = None
fetch_batch(offset)[source]

Retrieve one batch of work from the database.

item_query()[source]

Find the items that need to be processed in the sweep.

Returns:

A query object.

process_batch(offset)[source]

Process one batch of work.

process_item(item)[source]

Do the work that needs to be done for a given item.

process_items(items)[source]

Process a list of items.

run_once(*ignore)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

scope_to_collection(qu, collection)[source]

Restrict the given query so that it only finds items associated with the given collection.

Parameters:
  • qu – A query object.

  • collection – A Collection object, presumed to not be None.

class core.monitor.TimelineMonitor(_db, collection=None)[source]

Bases: Monitor

A monitor that needs to process everything that happened between two specific times.

This Monitor uses Timestamp.start and Timestamp.finish to describe the span of time covered in the most recent run, not the time it actually took to run.

OVERLAP = datetime.timedelta(seconds=300)
catch_up_from(start, cutoff, progress)[source]

Make sure all events between start and cutoff are covered.

Parameters:
  • start – Start looking for events that happened at this time.

  • cutoff – You’re not responsible for events that happened after this time.

  • progress – A TimestampData representing the progress so far. Unlike with run_once(), you are encouraged to modify this in place, for instance to set .achievements. However, you cannot change .start and .finish – any changes will be overwritten by run_once().

run_once(progress)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

classmethod slice_timespan(start, cutoff, increment)[source]

Slice a span of time into segments no larger than [increment].

This lets you divide up a task like “gather the entire circulation history for a collection” into chunks of one day.

Parameters:
  • start – A datetime.

  • cutoff – A datetime.

  • increment – A timedelta.
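
For example, a two-and-a-half-day span sliced into one-day segments. The exact shape of each yielded segment is not spelled out above, so the sketch simply prints whatever is yielded.

    import datetime

    from core.monitor import TimelineMonitor

    start = datetime.datetime(2022, 1, 1)
    cutoff = datetime.datetime(2022, 1, 3, 12, 0)
    one_day = datetime.timedelta(days=1)

    # Yields three segments: two full days plus a final half-day.
    for segment in TimelineMonitor.slice_timespan(start, cutoff, one_day):
        print(segment)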

class core.monitor.WorkReaper(*args, **kwargs)[source]

Bases: ReaperMonitor

Remove Works that have no associated LicensePools.

Unlike other reapers, no timestamp is relevant. As soon as a Work loses its last LicensePool it can be removed.

MODEL_CLASS

alias of Work

delete(work)[source]

Delete the work from Elasticsearch and the database.

query()[source]
class core.monitor.WorkSweepMonitor(_db, collection=None, batch_size=None)[source]

Bases: SweepMonitor

A Monitor that does something to every Work.

MODEL_CLASS

alias of Work

scope_to_collection(qu, collection)[source]

Restrict the query to only find Works found in the given Collection.

core.opds module

class core.opds.AcquisitionFeed(_db, title, url, works, annotator=None, precomposed_entries=[])[source]

Bases: OPDSFeed

CURRENT_ENTRYPOINT_ATTRIBUTE = '{http://librarysimplified.org/terms/}entryPoint'
FACET_REL = 'http://opds-spec.org/facet'

Add information necessary to find your current place in the site’s navigation.

A link with rel=”start” points to the start of the site

A <simplified:entrypoint> section describes the current entry point.

A <simplified:breadcrumbs> section contains a sequence of breadcrumb links.

add_breadcrumbs(lane, include_lane=False, entrypoint=None)[source]

Add list of ancestor links in a breadcrumbs element.

Parameters:
  • lane – Add breadcrumbs from up to this lane.

  • include_lane – Include lane itself in the breadcrumbs.

  • entrypoint – The currently selected entrypoint, if any.

TODO: The switchover from “no entry point” to “entry point” needs its own breadcrumb link.

add_entry(work)[source]

Attempt to create an OPDS <entry>. If successful, append it to the feed.

Add links to a feed forming an OPDS facet group for a set of EntryPoints.

Parameters:
  • feed – A lxml Tag object.

  • url_generator – A callable that returns the entry point URL when passed an EntryPoint.

  • entrypoints – A list of all EntryPoints in the facet group.

  • selected_entrypoint – The current EntryPoint, if selected.

as_error_response(**kwargs)[source]

Convert this feed into an OPDSFeedResponse that should be treated by intermediaries as an error – that is, treated as private and not cached.

as_response(**kwargs)[source]

Convert this feed into an OPDSFeedResponse.

create_entry(work, even_if_no_license_pool=False, force_create=False, use_cache=True)[source]

Turn a work into an entry for an acquisition feed.

classmethod error_message(identifier, error_status, error_message)[source]

Turn an error result into an OPDSMessage suitable for adding to a feed.

Build a set of attributes for a facet link.

Parameters:
  • href – Destination of the link.

  • title – Human-readable description of the facet.

  • facet_group_name – The facet group to which the facet belongs, e.g. “Sort By”.

  • is_active – True if this is the client’s currently selected facet.

Returns:

A dictionary of attributes, suitable for passing as keyword arguments into OPDSFeed.add_link_to_feed.

Create links for this feed’s navigational facet groups.

This does not create links for the entry point facet group, because those links should only be present in certain circumstances, and this method doesn’t know if those circumstances apply. You need to decide whether to call add_entrypoint_links in addition to calling this method.

classmethod format_types(delivery_mechanism)[source]

Generate a set of types suitable for passing into acquisition_link().

classmethod from_query(query, _db, feed_name, url, pagination, url_fn, annotator)[source]

Build a feed representing one page of a given list. Currently used for creating an OPDS feed for a custom list; the result is not cached.

TODO: This is used by the circulation manager admin interface. Investigate changing the code that uses this to use the search index – this is inefficient and creates an alternate code path that may harbor bugs.

TODO: This cannot currently return OPDSFeedResponse because the admin interface modifies the feed after it’s generated.

classmethod groups(_db, title, url, worklist, annotator, pagination=None, facets=None, max_age=None, search_engine=None, search_debug=False, **response_kwargs)[source]

The acquisition feed for ‘featured’ items from a given lane’s sublanes, organized into per-lane groups.

NOTE: If the lane has no sublanes, a grouped feed will probably be unsatisfying. Call page() instead with an appropriate Facets object.

Parameters:
  • pagination – A Pagination object. No single child of this lane will contain more than pagination.size items.

  • facets – A GroupsFacet object.

  • response_kwargs – Extra keyword arguments to pass into the OPDSFeedResponse constructor.

Returns:

An OPDSFeedResponse containing the feed.

classmethod indirect_acquisition(indirect_types)[source]
classmethod license_tags(license_pool, loan, hold, rel=None, library=None)[source]
classmethod minimal_opds_entry(identifier, cover, description, quality, most_recent_update=None)[source]
classmethod page(_db, title, url, worklist, annotator, facets=None, pagination=None, max_age=None, search_engine=None, search_debug=False, **response_kwargs)[source]

Create a feed representing one page of works from a given lane.

Parameters:

response_kwargs – Extra keyword arguments to pass into the OPDSFeedResponse constructor.

Returns:

An OPDSFeedResponse containing the feed.
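
A hedged sketch of calling page(). session, lane, and search_engine are assumed to already exist, and the title and URL are illustrative.

    from core.opds import AcquisitionFeed, TestAnnotator

    response = AcquisitionFeed.page(
        session,
        title="Example Lane",
        url="https://circulation.example.org/feeds/example-lane",
        worklist=lane,
        annotator=TestAnnotator(),
        search_engine=search_engine,
    )
    # 'response' is an OPDSFeedResponse ready to be returned from a Flask view.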

classmethod search(_db, title, url, lane, search_engine, query, pagination=None, facets=None, annotator=None, **response_kwargs)[source]

Run a search against the given search engine and return the results as a Flask Response.

Parameters:
  • _db – A database connection

  • title – The title of the resulting OPDS feed.

  • url – The URL from which the feed will be served.

  • search_engine – An ExternalSearchIndex.

  • query – The search query

  • pagination – A Pagination

  • facets – A Facets

  • annotator – An Annotator

  • response_kwargs – Keyword arguments to pass into the OPDSFeedResponse constructor.

Returns:

An OPDSFeedResponse

show_current_entrypoint(entrypoint)[source]

Annotate this given feed with a simplified:entryPoint attribute pointing to the current entrypoint’s TYPE_URI.

This gives clients an overall picture of the type of works in the feed, and a way to distinguish between one EntryPoint and another.

Parameters:

entrypoint – An EntryPoint.

classmethod single_entry(_db, work, annotator, force_create=False, raw=False, use_cache=True, **response_kwargs)[source]

Create a single-entry OPDS document for one specific work.

Parameters:
  • _db – A database connection.

  • work – A Work

  • annotator – An Annotator

  • force_create – Create the OPDS entry from scratch even if there’s already a cached one.

  • raw – If this is False (the default), a Flask Response will be returned, ready to be sent over the network. Otherwise an object representing the underlying OPDS entry will be returned.

  • use_cache – Boolean value determining whether the OPDS cache shall be used.

  • response_kwargs – These keyword arguments will be passed into the Response constructor, if it is invoked.

Returns:

A Response, if raw is false. Otherwise, an OPDSMessage or an etree._Element – whatever was returned by OPDSFeed.create_entry.

class core.opds.Annotator[source]

Bases: object

The Annotator knows how to present an OPDS feed in a specific application context.
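
A minimal subclass sketch showing where application-specific markup would be attached to each entry; the namespace and attribute used here are purely illustrative.

    from core.opds import Annotator

    class ApplicationAnnotator(Annotator):
        """Hypothetical annotator that stamps each entry with an extra attribute."""

        def annotate_work_entry(
            self, work, active_license_pool, edition, identifier, feed, entry, updated=None
        ):
            # 'entry' is an lxml Element; attach application-specific markup here.
            entry.set("{http://example.org/terms/}annotated", "true")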

classmethod active_licensepool_for(work)[source]

Which license pool would be/has been used to issue a license for this work?

classmethod annotate_feed(feed, lane, list=None)[source]

Make any custom modifications necessary to integrate this OPDS feed into the application’s workflow.

annotate_work_entry(work, active_license_pool, edition, identifier, feed, entry, updated=None)[source]

Make any custom modifications necessary to integrate this OPDS entry into the application’s workflow.

Work:

The Work whose OPDS entry is being annotated.

Active_license_pool:

Of all the LicensePools associated with this Work, the client has expressed interest in this one.

Edition:

The Edition to use when associating bibliographic metadata with this entry. You will probably not need to use this, because bibliographic metadata was associated with the entry when it was created.

Identifier:

Of all the Identifiers associated with this Work, the client has expressed interest in this one.

Parameters:
  • feed – An OPDSFeed – the feed in which this entry will be situated.

  • entry – An lxml Element object, the entry that will be added to the feed.

classmethod authors(work, edition)[source]

Create one or more <author> and <contributor> tags for the given Work.

Parameters:
  • work – The Work under consideration.

  • edition – The Edition to use as a reference for bibliographic information, including the list of Contributions.

classmethod categories(work)[source]

Return all relevant classifications of this work.

Returns:

A dictionary mapping ‘scheme’ URLs to dictionaries of attribute-value pairs.

Notable attributes: ‘term’, ‘label’, ‘http://schema.org/ratingValue’

classmethod content(work)[source]

Return an HTML summary of this work.

classmethod contributor_tag(contribution, state)[source]

Build an <author> or <contributor> tag for a Contribution.

Parameters:
  • contribution – A Contribution.

  • state – A defaultdict of sets, which may be used to keep track of what happened during previous calls to contributor_tag for a given Work.

Returns:

A Tag, or None if creating a Tag for this Contribution would be redundant or of low value.

Return all links to be used as cover links for this work.

In a distribution application, each work will have only one link. In a content server-type application, each work may have a large number of links.

Returns:

A 2-tuple (thumbnail_links, full_links)

classmethod default_lane_url()[source]
classmethod facet_url(facets, facet=None)[source]
classmethod featured_feed_url(lane, order=None, facets=None)[source]
classmethod feed_url(lane, facets=None, pagination=None)[source]
classmethod group_uri(work, license_pool, identifier)[source]

The URI to be associated with this Work when making it part of a grouped feed.

By default, this does nothing. See circulation/LibraryAnnotator for a subclass that does something.

Returns:

A 2-tuple (URI, title)

classmethod groups_url(lane, facets=None)[source]
is_work_entry_solo(work)[source]

Return a boolean value indicating whether the work’s OPDS catalog entry is served by itself, rather than as part of the feed.

Parameters:

work (core.model.work.Work) – Work object

Returns:

Boolean value indicating whether the work’s OPDS catalog entry is served by itself, rather than as a part of the feed

Return type:

bool

classmethod lane_id(lane)[source]
classmethod lane_url(lane, facets=None)[source]
classmethod navigation_url(lane)[source]
opds_cache_field = 'simple_opds_entry'

Generate a permanent link a client can follow for information about this entry, and only this entry.

Note that permalink is distinct from the Atom <id>, which is always the identifier’s URN.

Returns:

A 2-tuple (URL, media type). If a single value is returned, the media type will be presumed to be that of an OPDS entry.

classmethod rating_tag(type_uri, value)[source]

Generate a schema:Rating tag for the given type and value.

classmethod search_url(lane, query, pagination, facets=None)[source]
classmethod series(series_name, series_position)[source]

Generate a schema:Series tag for the given name and position.

sort_works_for_groups_feed(works, **kwargs)[source]
classmethod work_id(work)[source]
class core.opds.LookupAcquisitionFeed(_db, title, url, works, annotator=None, precomposed_entries=[])[source]

Bases: AcquisitionFeed

Used when the user has requested a lookup of a specific identifier, which may be different from the identifier used by the Work’s default LicensePool.

create_entry(work)[source]

Turn an Identifier and a Work into an entry for an acquisition feed.

class core.opds.NavigationFacets(minimum_featured_quality, entrypoint=None, random_seed=None, **kwargs)[source]

Bases: FeaturedFacets

CACHED_FEED_TYPE = 'navigation'
class core.opds.NavigationFeed(title, url)[source]

Bases: OPDSFeed

add_entry(url, title, type='application/atom+xml;profile=opds-catalog;kind=navigation')[source]

Create an OPDS navigation entry for a URL.

classmethod navigation(_db, title, url, worklist, annotator, facets=None, max_age=None, **response_kwargs)[source]

The navigation feed with links to a given lane’s sublanes.

Parameters:

response_kwargs – Extra keyword arguments to pass into the OPDSFeedResponse constructor.

Returns:

A Response

class core.opds.TestAnnotator[source]

Bases: Annotator

classmethod default_lane_url()[source]
classmethod facet_url(facets)[source]
classmethod feed_url(lane, facets=None, pagination=None)[source]
classmethod groups_url(lane, facets=None)[source]
classmethod lane_url(lane)[source]
classmethod navigation_url(lane)[source]
classmethod search_url(lane, query, pagination, facets=None)[source]
classmethod top_level_title()[source]
class core.opds.TestAnnotatorWithGroup[source]

Bases: TestAnnotator

group_uri(work, license_pool, identifier)[source]

The URI to be associated with this Work when making it part of a grouped feed.

By default, this does nothing. See circulation/LibraryAnnotator for a subclass that does something.

Returns:

A 2-tuple (URI, title)

group_uri_for_lane(lane)[source]
top_level_title()[source]
class core.opds.TestUnfulfillableAnnotator[source]

Bases: TestAnnotator

Raise an UnfulfillableWork exception when asked to annotate an entry.

annotate_work_entry(*args, **kwargs)[source]

Make any custom modifications necessary to integrate this OPDS entry into the application’s workflow.

Work:

The Work whose OPDS entry is being annotated.

Active_license_pool:

Of all the LicensePools associated with this Work, the client has expressed interest in this one.

Edition:

The Edition to use when associating bibliographic metadata with this entry. You will probably not need to use this, because bibliographic metadata was associated with the entry when it was created.

Identifier:

Of all the Identifiers associated with this Work, the client has expressed interest in this one.

Parameters:
  • feed – An OPDSFeed – the feed in which this entry will be situated.

  • entry – An lxml Element object, the entry that will be added to the feed.

exception core.opds.UnfulfillableWork[source]

Bases: Exception

Raise this exception when it turns out a Work currently cannot be fulfilled through any means, and this is a problem sufficient to cancel the creation of an <entry> for the Work.

For commercial works, this might be because the collection contains no licenses. For open-access works, it might be because none of the delivery mechanisms could be mirrored.

class core.opds.VerboseAnnotator[source]

Bases: Annotator

The default Annotator for machine-to-machine integration.

This Annotator describes all categories and authors for the book in great detail.

classmethod add_ratings(work, entry)[source]

Add a quality rating to the work.

annotate_work_entry(work, active_license_pool, edition, identifier, feed, entry)[source]

Make any custom modifications necessary to integrate this OPDS entry into the application’s workflow.

Work:

The Work whose OPDS entry is being annotated.

Active_license_pool:

Of all the LicensePools associated with this Work, the client has expressed interest in this one.

Edition:

The Edition to use when associating bibliographic metadata with this entry. You will probably not need to use this, because bibliographic metadata was associated with the entry when it was created.

Identifier:

Of all the Identifiers associated with this Work, the client has expressed interest in this one.

Parameters:
  • feed – An OPDSFeed – the feed in which this entry will be situated.

  • entry – An lxml Element object, the entry that will be added to the feed.

classmethod authors(work, edition)[source]

Create a detailed <author> tag for each author.

classmethod categories(work, policy=None)[source]

Send out _all_ categories for the work.

(So long as the category type has a URI associated with it in Subject.uri_lookup.)

Parameters:

policy – A PresentationCalculationPolicy to use when deciding how deep to go when finding equivalent identifiers for the work.

classmethod detailed_author(contributor)[source]

Turn a Contributor into a detailed <author> tag.

opds_cache_field = 'verbose_opds_entry'

core.opds2_import module

class core.opds2_import.OPDS2ImportMonitor(_db, collection, import_class, force_reimport=False, **import_class_kwargs)[source]

Bases: OPDSImportMonitor

MEDIA_TYPE = ('application/opds+json', 'application/json')
PROTOCOL = 'OPDS 2.0 Import'
class core.opds2_import.OPDS2Importer(db, collection, data_source_name=None, identifier_mapping=None, http_get=None, metadata_client=None, content_modifier=None, map_from_collection=None, mirrors=None)[source]

Bases: OPDSImporter

Imports editions and license pools from an OPDS 2.0 feed.

DESCRIPTION = l'Import books from a publicly-accessible OPDS 2.0 feed.'
NAME = 'OPDS 2.0 Import'
extract_feed_data(feed, feed_url=None)[source]

Turn an OPDS 2.0 feed into lists of Metadata and CirculationData objects.

Parameters:
  • feed (Union[str, opds2_ast.OPDS2Feed]) – OPDS 2.0 feed

  • feed_url (Optional[str]) – Feed URL used to resolve relative links

extract_last_update_dates(feed)[source]

Extract the last update dates of the feed’s publications.

Parameters:

feed (Union[str, opds2_ast.OPDS2Feed]) – OPDS 2.0 feed

Returns:

A list of 2-tuples containing publications’ identifiers and their last modified dates

Return type:

List[Tuple[str, datetime.datetime]]

Extracts “next” links from the feed.

Parameters:

feed (Union[str, opds2_ast.OPDS2Feed]) – OPDS 2.0 feed

Returns:

List of “next” links

Return type:

List[str]

core.opds2_import.parse_feed(feed, silent=True)[source]

Parse the feed into an OPDS2Feed object.

Parameters:
  • feed (Union[str, opds2_ast.OPDS2Feed]) – OPDS 2.0 feed

  • silent (bool) – Boolean value indicating whether parsing errors should be suppressed rather than raised

Returns:

Parsed OPDS 2.0 feed

Return type:

opds2_ast.OPDS2Feed

core.opds_import module

exception core.opds_import.AccessNotAuthenticated[source]

Bases: Exception

No authentication is configured for this service

class core.opds_import.MetadataWranglerOPDSLookup(url, shared_secret=None, collection=None)[source]

Bases: SimplifiedOPDSLookup, HasSelfTests

ADD_ENDPOINT = 'add'
ADD_WITH_METADATA_ENDPOINT = 'add_with_metadata'
CANONICALIZE_ENDPOINT = 'canonical-author-name'
CARDINALITY = 1
METADATA_NEEDED_ENDPOINT = 'metadata_needed'
NAME = l'Library Simplified Metadata Wrangler'
PROTOCOL = 'Metadata Wrangler'
REMOVE_ENDPOINT = 'remove'
SETTINGS = [{'key': 'url', 'label': l'URL', 'default': 'http://metadata.librarysimplified.org/', 'required': True, 'format': 'url'}]
SITEWIDE = True
UPDATES_ENDPOINT = 'updates'
add(identifiers)[source]

Add items to an authenticated Metadata Wrangler Collection

add_args(url, arg_string)[source]
add_with_metadata(feed)[source]

Add a feed of items with metadata to an authenticated Metadata Wrangler Collection.

property authenticated
property authorization
canonicalize_author_name(identifier, working_display_name)[source]

Attempt to find the canonical name for the author of a book.

Parameters:
  • identifier – an ISBN-type Identifier.

  • working_display_name – The display name of the author (i.e. the name format a human being would use, as opposed to the name that goes into library records).

classmethod external_integration(_db)[source]

Locate the ExternalIntegration associated with this object. The status of the self-tests will be stored as a ConfigurationSetting on this ExternalIntegration.

By default, there is no way to get from an object to its ExternalIntegration, and self-test status will not be stored.

classmethod from_config(_db, collection=None)[source]
get_collection_url(endpoint)[source]
property lookup_endpoint
metadata_needed(**kwargs)[source]

Get a feed of items that need additional metadata to be processed by the Metadata Wrangler.

remove(identifiers)[source]

Remove items from an authenticated Metadata Wrangler Collection

updates(last_update_time, **kwargs)[source]

Retrieve updated items from an authenticated Metadata Wrangler Collection

Parameters:

last_update_time – DateTime representing the last time an update was fetched. May be None.

class core.opds_import.MockMetadataWranglerOPDSLookup(*args, **kwargs)[source]

Bases: MockSimplifiedOPDSLookup, MetadataWranglerOPDSLookup

class core.opds_import.MockSimplifiedOPDSLookup(*args, **kwargs)[source]

Bases: SimplifiedOPDSLookup

queue_response(status_code, headers={}, content=None)[source]
class core.opds_import.OPDSImportMonitor(_db, collection, import_class, force_reimport=False, **import_class_kwargs)[source]

Bases: CollectionMonitor, HasSelfTests

Periodically monitor a Collection’s OPDS archive feed and import every title it mentions.
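
A hedged usage sketch; session and collection are assumed to exist, and by default the feed URL is read from the collection’s external account ID.

    from core.opds_import import OPDSImporter, OPDSImportMonitor

    monitor = OPDSImportMonitor(session, collection, import_class=OPDSImporter)
    monitor.run()  # follows the feed's pagination and imports entries not yet seen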

DEFAULT_START_TIME = <object object>
PROTOCOL = 'OPDS Import'
SERVICE_NAME = 'OPDS Import Monitor'
data_source(collection)[source]

Returns the data source name for the given collection.

By default, this name is stored as a setting on the collection, but subclasses may hard-code it.

external_integration(_db)[source]

Locate the ExternalIntegration associated with this object. The status of the self-tests will be stored as a ConfigurationSetting on this ExternalIntegration.

By default, there is no way to get from an object to its ExternalIntegration, and self-test status will not be stored.

feed_contains_new_data(feed)[source]

Does the given feed contain any entries that haven’t been imported yet?

Download a representation of a URL and extract the useful information.

Returns:

A 2-tuple (next_links, feed). next_links is a list of additional links that need to be followed. feed is the content that needs to be imported.

identifier_needs_import(identifier, last_updated_remote)[source]

Does the remote side have new information about this Identifier?

Parameters:
  • identifier – An Identifier.

  • last_updated_remote – The last time the remote side updated the OPDS entry for this Identifier.

import_one_feed(feed)[source]

Import every book mentioned in an OPDS feed.

opds_url(collection)[source]

Returns the OPDS import URL for the given collection.

By default, this URL is stored as the external account ID, but subclasses may override this.

run_once(progress_ignore)[source]

Do the actual work of the Monitor.

Parameters:

progress – A TimestampData representing the work done by the Monitor up to this point.

Returns:

A TimestampData representing how you want the Monitor’s entry in the timestamps table to look from this point on. NOTE: Modifying the incoming progress and returning it is generally a bad idea, because the incoming progress is full of old data. Instead, return a new TimestampData containing data for only the fields you want to set.

class core.opds_import.OPDSImporter(_db, collection, data_source_name=None, identifier_mapping=None, http_get=None, metadata_client=None, content_modifier=None, map_from_collection=None, mirrors=None)[source]

Bases: object

Imports editions and license pools from an OPDS feed. Creates Edition, LicensePool and Work rows in the database, if those don’t already exist.

Should be used when a circulation server asks for data from our internal content server, and also when our content server asks for data from external content servers.
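
A hedged sketch of importing a single feed document; session and collection are assumed to exist, the data source name and feed URL are illustrative, and feed holds the raw OPDS document (for example, the body of an HTTP response).

    from core.opds_import import OPDSImporter

    importer = OPDSImporter(session, collection, data_source_name="Example OPDS Source")
    result = importer.import_from_feed(feed, feed_url="https://opds.example.org/catalog.xml")
    # 'result' summarizes what was imported, including any failures.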

BASE_SETTINGS = [{'key': 'external_account_id', 'label': l'URL', 'required': True, 'format': 'url'}, {'key': 'data_source', 'label': l'Data source name', 'required': True}, {'key': 'default_audience', 'label': l'Default audience', 'description': l'If the vendor does not specify the target audience for their books, assume the books have this target audience.', 'type': 'select', 'format': 'narrow', 'options': [{'key': '', 'label': l'No default audience'}, {'key': 'Adult', 'label': 'Adult'}, {'key': 'Adults Only', 'label': 'Adults Only'}, {'key': 'All Ages', 'label': 'All Ages'}, {'key': 'Children', 'label': 'Children'}, {'key': 'Research', 'label': 'Research'}, {'key': 'Young Adult', 'label': 'Young Adult'}], 'default': '', 'required': False, 'readOnly': True}]
COULD_NOT_CREATE_LICENSE_POOL = 'No existing license pool for this identifier and no way of creating one.'
DESCRIPTION = l'Import books from a publicly-accessible OPDS feed.'
NAME = 'OPDS Import'
NO_DEFAULT_AUDIENCE = ''
PARSER_CLASS

alias of OPDSXMLParser

SETTINGS = [{'key': 'external_account_id', 'label': l'URL', 'required': True, 'format': 'url'}, {'key': 'data_source', 'label': l'Data source name', 'required': True}, {'key': 'default_audience', 'label': l'Default audience', 'description': l'If the vendor does not specify the target audience for their books, assume the books have this target audience.', 'type': 'select', 'format': 'narrow', 'options': [{'key': '', 'label': l'No default audience'}, {'key': 'Adult', 'label': 'Adult'}, {'key': 'Adults Only', 'label': 'Adults Only'}, {'key': 'All Ages', 'label': 'All Ages'}, {'key': 'Children', 'label': 'Children'}, {'key': 'Research', 'label': 'Research'}, {'key': 'Young Adult', 'label': 'Young Adult'}], 'default': '', 'required': False, 'readOnly': True}, {'key': 'username', 'label': l'Username', 'description': l'If HTTP Basic authentication is required to access the OPDS feed (it usually isn't), enter the username here.'}, {'key': 'password', 'label': l'Password', 'description': l'If HTTP Basic authentication is required to access the OPDS feed (it usually isn't), enter the password here.'}, {'key': 'custom_accept_header', 'label': l'Custom accept header', 'required': False, 'description': l'Some servers expect an accept header to decide which file to send. You can use */* if the server doesn't expect anything.', 'default': 'application/atom+xml;profile=opds-catalog;kind=acquisition,application/atom+xml;q=0.9,application/xml;q=0.8,*/*;q=0.1'}, {'key': 'primary_identifier_source', 'label': l'Identifer', 'required': False, 'description': l'Which book identifier to use as ID.', 'type': 'select', 'options': [{'key': '', 'label': l'(Default) Use <id>'}, {'key': 'first_dcterms_identifier', 'label': l'Use <dcterms:identifier> first, if not exist use <id>'}]}]
SUCCESS_STATUS_CODES = None
assert_importable_content(feed, feed_url, max_get_attempts=5)[source]

Raise an exception if the given feed contains nothing that can, even theoretically, be turned into a LicensePool.

By default, this means the feed must link to open-access content that can actually be retrieved.

build_identifier_mapping(external_urns)[source]

Uses the given Collection and a list of URNs to reverse engineer an identifier mapping.

NOTE: It would be better if .identifier_mapping weren’t instance data, since a single OPDSImporter might import multiple pages of a feed. However, the code as written should work.

property collection

Returns an associated Collection object

Returns:

Associated Collection object

Return type:

Optional[core.model.collection.Collection]

classmethod combine(d1, d2)[source]

Combine two dictionaries that can be used as keyword arguments to the Metadata constructor.

Try to match up links with their thumbnails.

If link n is an image and link n+1 is a thumbnail, then the thumbnail is assumed to be the thumbnail of the image.

Similarly if link n is a thumbnail and link n+1 is an image.

classmethod coveragefailure_from_message(data_source, message)[source]

Turn a <simplified:message> tag into a CoverageFailure.

classmethod coveragefailures_from_messages(data_source, parser, feed_tag)[source]

Extract CoverageFailure objects from a parsed OPDS document. This allows us to determine the fate of books which could not become <entry> tags.

classmethod data_detail_for_feedparser_entry(entry, data_source)[source]

Turn an entry dictionary created by feedparser into dictionaries of data that can be used as keyword arguments to the Metadata and CirculationData constructors.

Returns:

A 3-tuple (identifier, kwargs for Metadata constructor, failure)

property data_source

Look up or create a DataSource object representing the source of this OPDS feed.

classmethod detail_for_elementtree_entry(parser, entry_tag, data_source, feed_url=None, do_get=None)[source]

Turn an <atom:entry> tag into a dictionary of metadata that can be used as keyword arguments to the Metadata constructor.

Returns:

A 2-tuple (identifier, kwargs)

classmethod extract_contributor(parser, author_tag)[source]

Turn an <atom:author> tag into a ContributorData object.

extract_data_from_feedparser(feed, data_source)[source]
extract_feed_data(feed, feed_url=None)[source]

Turn an OPDS feed into lists of Metadata and CirculationData objects, with associated messages and next_links.

classmethod extract_identifier(identifier_tag)[source]

Turn a <dcterms:identifier> tag into an IdentifierData object.

extract_last_update_dates(feed)[source]

Convert a <link> tag into a LinkData object.

Parameters:
  • feed_url – The URL to the enclosing feed, for use in resolving relative links.

  • entry_rights_uri – A URI describing the rights advertised in the entry. Unless this specific link says otherwise, we will assume that the representation on the other end of the link is made available on these terms.

classmethod extract_measurement(rating_tag)[source]
classmethod extract_medium(entry_tag, default='Book')[source]

Derive a value for Edition.medium from schema:additionalType or from a <dcterms:format> subtag.

Parameters:
  • entry_tag – A <atom:entry> tag.

  • default – The value to use if nothing is found.

classmethod extract_messages(parser, feed_tag)[source]

Extract <simplified:message> tags from an OPDS feed and convert them into OPDSMessage objects.

classmethod extract_metadata_from_elementtree(feed, data_source, feed_url=None, do_get=None)[source]

Parse the OPDS as XML and extract all author and subject information, as well as ratings and medium.

All the stuff that Feedparser can’t handle so we have to use lxml.

Returns:

a dictionary mapping IDs to dictionaries. The inner dictionary can be used as keyword arguments to the Metadata constructor.

classmethod extract_series(series_tag)[source]
classmethod extract_subject(parser, category_tag)[source]

Turn an <atom:category> tag into a SubjectData object.

Get medium if derivable from information in an acquisition link.

handle_failure(urn, failure)[source]

Convert a URN and a failure message that came in through an OPDS feed into an Identifier and a CoverageFailure object.

The Identifier may not be the one designated by urn (if it’s found in self.identifier_mapping) and the ‘failure’ may turn out not to be a CoverageFailure at all – if it’s an Identifier, that means that what a normal OPDSImporter would consider ‘failure’ is considered success.

import_edition_from_metadata(metadata)[source]

For the passed-in Metadata object, see if we can find or create an Edition in the database. Also create a LicensePool if the Metadata has CirculationData in it.

import_from_feed(feed, feed_url=None)[source]
last_update_date_for_feedparser_entry(entry)[source]

Hook method for creating a LinkData object.

Intended to be overridden in subclasses.

classmethod rights_uri(rights_string)[source]

Determine the URI that best encapsulates the rights status of the downloads associated with this book.

classmethod rights_uri_from_entry_tag(entry)[source]

Extract a rights string from an lxml <entry> tag.

Returns:

A rights URI.

classmethod rights_uri_from_feedparser_entry(entry)[source]

Extract a rights URI from a parsed feedparser entry.

Returns:

A rights URI.

update_work_for_edition(edition)[source]

If possible, ensure that there is a presentation-ready Work for the given edition’s primary identifier.

class core.opds_import.OPDSXMLParser[source]

Bases: XMLParser

NAMESPACES = {'app': 'http://www.w3.org/2007/app', 'atom': 'http://www.w3.org/2005/Atom', 'dc': 'http://purl.org/dc/elements/1.1/', 'dcterms': 'http://purl.org/dc/terms/', 'drm': 'http://librarysimplified.org/terms/drm', 'opds': 'http://opds-spec.org/2010/catalog', 'schema': 'http://schema.org/', 'simplified': 'http://librarysimplified.org/terms/'}
class core.opds_import.SimplifiedOPDSLookup(base_url)[source]

Bases: object

Tiny integration class for the Simplified ‘lookup’ protocol.

LOOKUP_ENDPOINT = 'lookup'
classmethod check_content_type(response)[source]
classmethod from_protocol(_db, protocol, goal='licenses', library=None)[source]
lookup(identifiers)[source]

Retrieve an OPDS feed with metadata for the given identifiers.
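
A hedged sketch; the base URL is illustrative and identifiers is assumed to be a list of Identifier objects from the application database.

    from core.opds_import import SimplifiedOPDSLookup

    lookup_client = SimplifiedOPDSLookup("https://metadata.example.org/")
    response = lookup_client.lookup(identifiers)
    # 'response' carries an OPDS feed describing the requested identifiers.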

property lookup_endpoint
urn_args(identifiers)[source]
core.opds_import.parse_identifier(db, identifier)[source]

Parse the identifier and return an Identifier object representing it.

Parameters:
  • db (sqlalchemy.orm.session.Session) – Database session

  • identifier (str) – String containing the identifier

Returns:

Identifier object

Return type:

core.model.identifier.Identifier
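
A hedged usage sketch; session is an assumed SQLAlchemy session and the URN is illustrative.

    from core.opds_import import parse_identifier

    identifier = parse_identifier(session, "urn:isbn:9780134093413")
    # 'identifier' is a core.model.identifier.Identifier representing the URN.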

core.opensearch module

class core.opensearch.OpenSearchDocument[source]

Bases: object

Generates OpenSearch documents.

TEMPLATE = '<?xml version="1.0" encoding="UTF-8"?>\n <OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">\n   <ShortName>%(name)s</ShortName>\n   <Description>%(description)s</Description>\n   <Tags>%(tags)s</Tags>\n   <Url type="application/atom+xml;profile=opds-catalog" template="%(url_template)s"/>\n </OpenSearchDescription>'
classmethod escape_entities(info)[source]

Escape ampersands in the given dictionary’s values.

classmethod for_lane(lane, base_url)[source]
classmethod search_info(lane)[source]
classmethod url_template(base_url)[source]

Turn a base URL into an OpenSearch URL template.

core.overdrive module

class core.overdrive.MockOverdriveAPI(_db, collection, *args, **kwargs)[source]

Bases: OverdriveAPI

mock_access_token_response(credential)[source]
classmethod mock_collection(_db, library=None, name='Test Overdrive Collection', client_key='a', client_secret='b', library_id='c', website_id='d', ils_name='e')[source]

Create a mock Overdrive collection for use in tests.

mock_collection_token(token)[source]
queue_collection_token()[source]
queue_response(status_code, headers={}, content=None)[source]
token_post(url, payload, headers={}, **kwargs)[source]

Mock the request for an OAuth token.

We mock the method by looking at the access_token_response property, rather than inserting a mock response in the queue, because only the first MockOverdriveAPI instantiation in a given test actually makes this call. By mocking the response to this method separately we remove the need to figure out whether to queue a response in a given test.

class core.overdrive.OverdriveAPI(_db, collection)[source]

Bases: object

ADVANTAGE_LIBRARY_ENDPOINT = '%(host)s/v1/libraries/%(parent_library_id)s/advantageAccounts/%(library_id)s'
ALL_PRODUCTS_ENDPOINT = '%(host)s/v1/collections/%(collection_token)s/products?sort=%(sort)s'
AVAILABILITY_ENDPOINT = '%(host)s/v2/collections/%(collection_token)s/products/%(product_id)s/availability'
CHECKOUTS_ENDPOINT = '%(patron_host)s/v1/patrons/me/checkouts'
CHECKOUT_ENDPOINT = '%(patron_host)s/v1/patrons/me/checkouts/%(overdrive_id)s'
DEFAULT_READABLE_FORMATS = {'audiobook-overdrive', 'ebook-epub-adobe', 'ebook-epub-open', 'ebook-pdf-open'}
EVENTS_ENDPOINT = '%(host)s/v1/collections/%(collection_token)s/products?lastUpdateTime=%(lastupdatetime)s&sort=%(sort)s&limit=%(limit)s'
EVENT_DELAY = datetime.timedelta(seconds=7200)
EVENT_SOURCE = 'Overdrive'
FORMATS = ['ebook-epub-open', 'ebook-epub-adobe', 'ebook-pdf-adobe', 'ebook-pdf-open', 'audiobook-overdrive']
FORMATS_ENDPOINT = '%(patron_host)s/v1/patrons/me/checkouts/%(overdrive_id)s/formats'
HOLDS_ENDPOINT = '%(patron_host)s/v1/patrons/me/holds'
HOLD_ENDPOINT = '%(patron_host)s/v1/patrons/me/holds/%(product_id)s'
HOSTS = {'production': {'host': 'https://api.overdrive.com', 'oauth_host': 'https://oauth.overdrive.com', 'oauth_patron_host': 'https://oauth-patron.overdrive.com', 'patron_host': 'https://patron.api.overdrive.com'}, 'testing': {'host': 'https://integration.api.overdrive.com', 'oauth_host': 'https://oauth.overdrive.com', 'oauth_patron_host': 'https://oauth-patron.overdrive.com', 'patron_host': 'https://integration-patron.api.overdrive.com'}}
ILS_NAME_DEFAULT = 'default'
ILS_NAME_KEY = 'ils_name'
INCOMPATIBLE_PLATFORM_FORMATS = {'ebook-kindle'}
LIBRARY_ENDPOINT = '%(host)s/v1/libraries/%(library_id)s'
MAX_CREDENTIAL_AGE = 3000
METADATA_ENDPOINT = '%(host)s/v1/collections/%(collection_token)s/products/%(item_id)s/metadata'
ME_ENDPOINT = '%(patron_host)s/v1/patrons/me'
OVERDRIVE_READ_FORMAT = 'ebook-overdrive'
PAGE_SIZE_LIMIT = 300
PATRON_INFORMATION_ENDPOINT = '%(patron_host)s/v1/patrons/me'
PATRON_TOKEN_ENDPOINT = '%(oauth_patron_host)s/patrontoken'
PRODUCTION_SERVERS = 'production'
SERVER_NICKNAME = 'server_nickname'
TESTING_SERVERS = 'testing'
TIME_FORMAT = '%Y-%m-%dT%H:%M:%SZ'
TOKEN_ENDPOINT = '%(oauth_host)s/token'
WEBSITE_ID = 'website_id'
property advantage_library_id

The library ID for this library, as we should look for it in certain API documents served by Overdrive.

For ordinary collections, and for consortial collections shared among libraries, this will be -1.

For Overdrive Advantage accounts, this will be the numeric value of the Overdrive library ID.

all_ids()[source]

Get IDs for every book in the system, with the most recently added ones at the front.

check_creds(force_refresh=False)[source]

If the Bearer Token has expired, update it.

property collection
property collection_token

Get the token representing this particular Overdrive collection.

As a side effect, this will verify that the Overdrive credentials are working.

credential_object(refresh)[source]

Look up the Credential object that allows us to use the Overdrive API.

endpoint(url, **kwargs)[source]

Create the URL to an Overdrive API endpoint.

Parameters:
  • url – A template for the URL.

  • kwargs – Arguments to be interpolated into the template. The server hostname will be interpolated automatically; you don’t have to pass it in.
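
For example, filling in the metadata endpoint template; api is an assumed, already-configured OverdriveAPI instance, and the token and item ID are illustrative placeholders.

    url = api.endpoint(
        api.METADATA_ENDPOINT,
        collection_token="example-collection-token",
        item_id="example-overdrive-id",
    )
    # The %(host)s portion of the template is interpolated automatically.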

get(url, extra_headers, exception_on_401=False)[source]

Make an HTTP GET request using the active Bearer Token.

get_advantage_accounts()[source]

Find all the Overdrive Advantage accounts managed by this library.

Yield:

A sequence of OverdriveAdvantageAccount objects.

get_library()[source]

Get basic information about the collection, including a link to the titles in the collection.

host = {'host': 'https://integration.api.overdrive.com', 'oauth_host': 'https://oauth.overdrive.com', 'oauth_patron_host': 'https://oauth-patron.overdrive.com', 'patron_host': 'https://integration-patron.api.overdrive.com'}
ils_name(library)[source]

Determine the ILS name to use for the given Library.

classmethod ils_name_setting(_db, collection, library)[source]

Find the ConfigurationSetting controlling the ILS name for the given collection and library.

lock = <unlocked _thread.RLock object owner=0 count=0>
log = <Logger Overdrive API (WARNING)>

Turn a server-provided link into a link the server will accept!

The {} part is completely obnoxious and I have complained about it to Overdrive.

The availability part is to make sure we always use v2 of the availability API, even if Overdrive sent us a link to v1.

metadata_lookup(identifier)[source]

Look up metadata for an Overdrive identifier.

metadata_lookup_obj(identifier)[source]
recently_changed_ids(start, cutoff)[source]

Get IDs of books whose status has changed between the start time and now.

refresh_creds(credential)[source]

Fetch a new Bearer Token and update the given Credential object.

property source
property token
property token_authorization_header
token_post(url, payload, headers={}, **kwargs)[source]

Make an HTTP POST request for purposes of getting an OAuth token.

class core.overdrive.OverdriveAdvantageAccount(parent_library_id, library_id, name)[source]

Bases: object

Holder and parser for data associated with Overdrive Advantage.

classmethod from_representation(content)[source]

Turn the representation of an advantageAccounts link into a list of OverdriveAdvantageAccount objects.

Parameters:

content – The data obtained by following an advantageAccounts link.

Yield:

A sequence of OverdriveAdvantageAccount objects.

to_collection(_db)[source]

Find or create a Collection object for this Overdrive Advantage account.

Returns:

a 2-tuple of Collections (primary Overdrive collection, Overdrive Advantage collection)

class core.overdrive.OverdriveBibliographicCoverageProvider(collection, api_class=<class 'core.overdrive.OverdriveAPI'>, **kwargs)[source]

Bases: BibliographicCoverageProvider

Fill in bibliographic metadata for Overdrive records.

This will occasionally fill in some availability information for a single Collection, but we rely on Monitors to keep availability information up to date for all Collections.

DATA_SOURCE_NAME = 'Overdrive'
INPUT_IDENTIFIER_TYPES = 'Overdrive ID'
PROTOCOL = 'Overdrive'
SERVICE_NAME = 'Overdrive Bibliographic Coverage Provider'
metadata_pre_hook(metadata)[source]

A hook method that allows subclasses to modify a Metadata object derived from Overdrive before it’s applied.

process_item(identifier)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.overdrive.OverdriveRepresentationExtractor(api)[source]

Bases: object

Extract useful information from Overdrive’s JSON representations.

DATE_FORMAT = '%Y-%m-%d'
Returns:

A list of dictionaries with keys id, title, availability_link.

book_info_to_circulation(book)[source]

Note: The json data passed into this method is from a different file/stream from the json data that goes into the book_info_to_metadata() method.

classmethod book_info_to_metadata(book, include_bibliographic=True, include_formats=True)[source]

Turn Overdrive’s JSON representation of a book into a Metadata object.

Note: The json data passed into this method is from a different file/stream from the json data that goes into the book_info_to_circulation() method.

format_data_for_overdrive_format = {'audiobook-mp3': ('application/x-od-media', 'Overdrive DRM'), 'audiobook-overdrive': [('application/vnd.overdrive.circulation.api+json;profile=audiobook', 'Libby DRM'), ('Streaming Audio', 'Streaming')], 'ebook-epub-adobe': ('application/epub+zip', 'application/vnd.adobe.adept+xml'), 'ebook-epub-open': ('application/epub+zip', None), 'ebook-kindle': ('Kindle via Amazon', 'Kindle DRM'), 'ebook-overdrive': [('application/vnd.overdrive.circulation.api+json;profile=ebook', 'Libby DRM'), ('Streaming Text', 'Streaming')], 'ebook-pdf-adobe': ('application/pdf', 'application/vnd.adobe.adept+xml'), 'ebook-pdf-open': ('application/pdf', None), 'music-mp3': ('application/x-od-media', 'Overdrive DRM'), 'periodicals-nook': ('Nook via B&N', 'Nook DRM'), 'video-streaming': ('Streaming Video', 'Streaming')}
ignorable_overdrive_formats = {}
classmethod internal_formats(overdrive_format)[source]

Yield all internal formats for the given Overdrive format.

Some Overdrive formats become multiple internal formats.

Yield:

A sequence of (content type, DRM system) 2-tuples
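
For example, based on the format mapping shown below, ‘ebook-overdrive’ expands to more than one internal format.

    from core.overdrive import OverdriveRepresentationExtractor

    for content_type, drm_scheme in OverdriveRepresentationExtractor.internal_formats(
        "ebook-overdrive"
    ):
        print(content_type, drm_scheme)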

log = <Logger Overdrive representation extractor (WARNING)>
overdrive_medium_to_simplified_medium = {'Audiobook': 'Audio', 'Music': 'Music', 'Periodicals': 'Periodical', 'Video': 'Video', 'eBook': 'Book'}
overdrive_role_to_simplified_role = {'actor': 'Actor', 'adapter': 'Adapter', 'artist': 'Artist', 'associated name': 'Associated name', 'author': 'Author', 'author of afterword': 'Afterword Author', 'author of foreword': 'Foreword Author', 'author of introduction': 'Introduction Author', 'book producer': 'Producer', 'cast member': 'Actor', 'collaborator': 'Collaborator', 'colophon': 'Colophon Author', 'compiler': 'Compiler', 'composer': 'Composer', 'contributor': 'Contributor', 'copyright holder': 'Copyright holder', 'designer': 'Designer', 'director': 'Director', 'editor': 'Editor', 'engineer': 'Engineer', 'etc.': 'Unknown', 'executive producer': 'Executive Producer', 'illustrator': 'Illustrator', 'lyricist': 'Lyricist', 'musician': 'Musician', 'narrator': 'Narrator', 'other': 'Unknown', 'performer': 'Performer', 'photographer': 'Photographer', 'producer': 'Producer', 'transcriber': 'Transcriber', 'translator': 'Translator'}
classmethod parse_roles(id, rolestring)[source]

core.problem_details module

core.s3 module

class core.s3.MinIOUploader(integration, client_class=None)[source]

Bases: S3Uploader

NAME = 'MinIO'
SETTINGS = [{'key': 'username', 'label': l'Access Key', 'description': '', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'password', 'label': l'Secret Key', 'description': l'If the <em>Access Key</em> and <em>Secret Key</em> are not given here credentials will be used as outlined in the <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#configuring-credentials">Boto3 documenation</a>. If <em>Access Key</em> is given, <em>Secrent Key</em> must also be given.', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'book_covers_bucket', 'label': l'Book Covers Bucket', 'description': l'All book cover images encountered will be mirrored to this S3 bucket. Large images will be scaled down, and the scaled-down copies will also be uploaded to this bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'open_access_content_bucket', 'label': l'Open Access Content Bucket', 'description': l'All open-access books encountered will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'protected_content_bucket', 'label': l'Protected Access Content Bucket', 'description': l'Self-hosted books will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'marc_bucket', 'label': l'MARC File Bucket', 'description': l'All generated MARC files will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 's3_region', 'label': l'S3 region', 'description': l'S3 region which will be used for storing the content.', 'type': 'select', 'required': False, 'default': 'us-east-1', 'options': [{'key': 'af-south-1', 'label': 'af-south-1'}, {'key': 'ap-east-1', 'label': 'ap-east-1'}, {'key': 'ap-northeast-1', 'label': 'ap-northeast-1'}, {'key': 'ap-northeast-2', 'label': 'ap-northeast-2'}, {'key': 'ap-northeast-3', 'label': 'ap-northeast-3'}, {'key': 'ap-south-1', 'label': 'ap-south-1'}, {'key': 'ap-southeast-1', 'label': 'ap-southeast-1'}, {'key': 'ap-southeast-2', 'label': 'ap-southeast-2'}, {'key': 'ap-southeast-3', 'label': 'ap-southeast-3'}, {'key': 'ca-central-1', 'label': 'ca-central-1'}, {'key': 'eu-central-1', 'label': 'eu-central-1'}, {'key': 'eu-north-1', 'label': 'eu-north-1'}, {'key': 'eu-south-1', 'label': 'eu-south-1'}, {'key': 'eu-west-1', 'label': 'eu-west-1'}, {'key': 'eu-west-2', 'label': 'eu-west-2'}, {'key': 'eu-west-3', 'label': 'eu-west-3'}, {'key': 'me-south-1', 'label': 'me-south-1'}, {'key': 'sa-east-1', 'label': 'sa-east-1'}, {'key': 'us-east-1', 'label': 'us-east-1'}, {'key': 'us-east-2', 'label': 'us-east-2'}, {'key': 'us-west-1', 'label': 'us-west-1'}, {'key': 'us-west-2', 'label': 'us-west-2'}], 'category': None}, {'key': 's3_addressing_style', 'label': l'S3 addressing style', 'description': l'Buckets created after September 30, 2020, will support only virtual hosted-style requests. Path-style requests will continue to be supported for buckets created on or before this date. 
For more information, see <a href="https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/">Amazon S3 Path Deprecation Plan - The Rest of the Story</a>.', 'type': 'select', 'required': False, 'default': 'us-east-1', 'options': [{'key': 'virtual', 'label': l'Virtual'}, {'key': 'path', 'label': l'Path'}, {'key': 'auto', 'label': l'Auto'}], 'category': None}, {'key': 's3_presigned_url_expiration', 'label': l'S3 presigned URL expiration', 'description': l'Time in seconds for the presigned URL to remain valid', 'type': 'number', 'required': False, 'default': 3600, 'options': None, 'category': None}, {'key': 'bucket_name_transform', 'label': l'URL format', 'description': l'A file mirrored to S3 is available at <code>http://{bucket}.s3.{region}.amazonaws.com/{filename}</code>. If you've set up your DNS so that http://[bucket]/ or https://[bucket]/ points to the appropriate S3 bucket, you can configure this S3 integration to shorten the URLs. <p>If you haven't set up your S3 buckets, don't change this from the default -- you'll get URLs that don't work.</p>', 'type': 'select', 'required': False, 'default': 'identity', 'options': [{'key': 'identity', 'label': l'S3 Default: https://{bucket}.s3.{region}.amazonaws.com/{file}'}, {'key': 'https', 'label': l'HTTPS: https://{bucket}/{file}'}, {'key': 'http', 'label': l'HTTP: http://{bucket}/{file}'}], 'category': None}, {'key': 'ENDPOINT_URL', 'label': l'Endpoint URL', 'description': l'MinIO's endpoint URL', 'type': None, 'required': True, 'default': None, 'options': None, 'category': None, 'format': None}]
class core.s3.MinIOUploaderConfiguration(configuration_storage, db)[source]

Bases: ConfigurationGrouping

ENDPOINT_URL = 'ENDPOINT_URL'
endpoint_url

Contains configuration metadata

class core.s3.MockS3Client(service, region_name, aws_access_key_id, aws_secret_access_key, config=None)[source]

Bases: object

This mock client lets us test the real S3Uploader class with a mocked-up boto3 client.

abort_multipart_upload(**kwargs)[source]
complete_multipart_upload(**kwargs)[source]
create_multipart_upload(**kwargs)[source]
generate_presigned_url(ClientMethod, Params=None, ExpiresIn=3600, HttpMethod=None)[source]
upload_fileobj(Fileobj, Bucket, Key, ExtraArgs=None, **kwargs)[source]
upload_part(**kwargs)[source]
class core.s3.MockS3Uploader(fail=False, *args, **kwargs)[source]

Bases: S3Uploader

A dummy uploader for use in tests.

buckets = {'book_covers_bucket': 'test-cover-bucket', 'marc_bucket': 'test-marc-bucket', 'open_access_content_bucket': 'test-content-bucket', 'protected_content_bucket': 'test-content-bucket'}
mirror_one(representation, **kwargs)[source]

Mirror a single representation to the given URL.

multipart_upload(representation, mirror_to)[source]
class core.s3.MultipartS3Upload(uploader, representation, mirror_to)[source]

Bases: object

abort()[source]
complete()[source]
upload_part(content)[source]
class core.s3.S3AddressingStyle(value)[source]

Bases: Enum

Enumeration of different addressing styles supported by boto

AUTO = 'auto'
PATH = 'path'
VIRTUAL = 'virtual'
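
These values correspond to boto3's addressing-style client configuration. A minimal sketch of wiring one of them into a boto3 client (the region and credentials resolution are placeholders; how S3Uploader itself builds its client is not shown here):

import boto3
from botocore.client import Config

# Hedged sketch: an S3 client that uses virtual hosted-style addressing,
# matching S3AddressingStyle.VIRTUAL above. Credentials come from the
# normal boto3 resolution chain.
client = boto3.client(
    "s3",
    region_name="us-east-1",
    config=Config(s3={"addressing_style": "virtual"}),
)
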
class core.s3.S3Uploader(integration, client_class=None, host='amazonaws.com')[source]

Bases: MirrorUploader

NAME = 'Amazon S3'
S3_HOST = 'amazonaws.com'
SETTINGS = [{'key': 'username', 'label': l'Access Key', 'description': '', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'password', 'label': l'Secret Key', 'description': l'If the <em>Access Key</em> and <em>Secret Key</em> are not given here credentials will be used as outlined in the <a href="https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#configuring-credentials">Boto3 documentation</a>. If <em>Access Key</em> is given, <em>Secret Key</em> must also be given.', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'book_covers_bucket', 'label': l'Book Covers Bucket', 'description': l'All book cover images encountered will be mirrored to this S3 bucket. Large images will be scaled down, and the scaled-down copies will also be uploaded to this bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'open_access_content_bucket', 'label': l'Open Access Content Bucket', 'description': l'All open-access books encountered will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'protected_content_bucket', 'label': l'Protected Access Content Bucket', 'description': l'Self-hosted books will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 'marc_bucket', 'label': l'MARC File Bucket', 'description': l'All generated MARC files will be uploaded to this S3 bucket. <p>The bucket must already exist&mdash;it will not be created automatically.</p>', 'type': None, 'required': False, 'default': None, 'options': None, 'category': None}, {'key': 's3_region', 'label': l'S3 region', 'description': l'S3 region which will be used for storing the content.', 'type': 'select', 'required': False, 'default': 'us-east-1', 'options': [{'key': 'af-south-1', 'label': 'af-south-1'}, {'key': 'ap-east-1', 'label': 'ap-east-1'}, {'key': 'ap-northeast-1', 'label': 'ap-northeast-1'}, {'key': 'ap-northeast-2', 'label': 'ap-northeast-2'}, {'key': 'ap-northeast-3', 'label': 'ap-northeast-3'}, {'key': 'ap-south-1', 'label': 'ap-south-1'}, {'key': 'ap-southeast-1', 'label': 'ap-southeast-1'}, {'key': 'ap-southeast-2', 'label': 'ap-southeast-2'}, {'key': 'ap-southeast-3', 'label': 'ap-southeast-3'}, {'key': 'ca-central-1', 'label': 'ca-central-1'}, {'key': 'eu-central-1', 'label': 'eu-central-1'}, {'key': 'eu-north-1', 'label': 'eu-north-1'}, {'key': 'eu-south-1', 'label': 'eu-south-1'}, {'key': 'eu-west-1', 'label': 'eu-west-1'}, {'key': 'eu-west-2', 'label': 'eu-west-2'}, {'key': 'eu-west-3', 'label': 'eu-west-3'}, {'key': 'me-south-1', 'label': 'me-south-1'}, {'key': 'sa-east-1', 'label': 'sa-east-1'}, {'key': 'us-east-1', 'label': 'us-east-1'}, {'key': 'us-east-2', 'label': 'us-east-2'}, {'key': 'us-west-1', 'label': 'us-west-1'}, {'key': 'us-west-2', 'label': 'us-west-2'}], 'category': None}, {'key': 's3_addressing_style', 'label': l'S3 addressing style', 'description': l'Buckets created after September 30, 2020, will support only virtual hosted-style requests. Path-style requests will continue to be supported for buckets created on or before this date. 
For more information, see <a href="https://aws.amazon.com/blogs/aws/amazon-s3-path-deprecation-plan-the-rest-of-the-story/">Amazon S3 Path Deprecation Plan - The Rest of the Story</a>.', 'type': 'select', 'required': False, 'default': 'us-east-1', 'options': [{'key': 'virtual', 'label': l'Virtual'}, {'key': 'path', 'label': l'Path'}, {'key': 'auto', 'label': l'Auto'}], 'category': None}, {'key': 's3_presigned_url_expiration', 'label': l'S3 presigned URL expiration', 'description': l'Time in seconds for the presigned URL to remain valid', 'type': 'number', 'required': False, 'default': 3600, 'options': None, 'category': None}, {'key': 'bucket_name_transform', 'label': l'URL format', 'description': l'A file mirrored to S3 is available at <code>http://{bucket}.s3.{region}.amazonaws.com/{filename}</code>. If you've set up your DNS so that http://[bucket]/ or https://[bucket]/ points to the appropriate S3 bucket, you can configure this S3 integration to shorten the URLs. <p>If you haven't set up your S3 buckets, don't change this from the default -- you'll get URLs that don't work.</p>', 'type': 'select', 'required': False, 'default': 'identity', 'options': [{'key': 'identity', 'label': l'S3 Default: https://{bucket}.s3.{region}.amazonaws.com/{file}'}, {'key': 'https', 'label': l'HTTPS: https://{bucket}/{file}'}, {'key': 'http', 'label': l'HTTP: http://{bucket}/{file}'}], 'category': None}]
SITEWIDE = True
book_url(identifier, extension='.epub', open_access=True, data_source=None, title=None)[source]

The path to the hosted EPUB file for the given identifier.

content_root(bucket)[source]

The root URL to the S3 location of hosted content of the given type.

cover_image_root(bucket, data_source, scaled_size=None)[source]

The root URL to the S3 location of cover images for the given data source.

cover_image_url(data_source, identifier, filename, scaled_size=None)[source]

The path to the hosted cover image for the given identifier.

final_mirror_url(bucket, key)[source]

Determine the URL to pass into Representation.set_as_mirrored, assuming that it was successfully uploaded to the given bucket as key.

Depending on ExternalIntegration configuration this may be any of the following:

  • https://{bucket}.s3.{region}.amazonaws.com/{key}

  • http://{bucket}/{key}

  • https://{bucket}/{key}
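
As a rough illustration, the three forms correspond to the URL_TEMPLATES_BY_TEMPLATE mapping documented under S3UploaderConfiguration below. A minimal sketch of expanding those templates by hand (the bucket, key, and the exact region/host value substituted into the 'identity' template are assumptions):

# Hedged sketch: expanding the bucket_name_transform templates manually.
templates = {
    "identity": "https://%(bucket)s.s3.%(region)s/%(key)s",
    "https": "https://%(bucket)s/%(key)s",
    "http": "http://%(bucket)s/%(key)s",
}
values = {"bucket": "my-covers", "region": "us-east-1.amazonaws.com", "key": "covers/1.png"}
for name, template in templates.items():
    print(name, template % values)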

get_bucket(bucket_key)[source]

Gets the bucket for a particular use based on the given key

classmethod key_join(key, encode=True)[source]

Quote the path portions of an S3 key while leaving the path separator characters ('/') themselves alone.

Parameters:

key – Either a key, or a list of parts to be assembled into a key.

Returns:

A string that can be used as an S3 key.
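
A minimal sketch approximating this behavior with the standard library (not the actual implementation; key_join may handle bytes, encodings, and edge cases differently):

from urllib.parse import quote

def naive_key_join(parts):
    # Quote each path component but leave the "/" separators between them alone.
    return "/".join(quote(str(part)) for part in parts)

print(naive_key_join(["covers", "Gutenberg ID", "1234", "cover.jpg"]))
# covers/Gutenberg%20ID/1234/cover.jpg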

marc_file_root(bucket, library)[source]
marc_file_url(library, lane, end_time, start_time=None)[source]

The path to the hosted MARC file for the given library, lane, and date range.

mirror_one(representation, mirror_to, collection=None)[source]

Mirror a single representation to the given URL.

multipart_upload(representation, mirror_to, upload_class=<class 'core.s3.MultipartS3Upload'>)[source]
sign_url(url, expiration=None)[source]

Signs a URL and makes it expirable.

Parameters:
  • url (string) – URL

  • expiration (int) – (Optional) Time in seconds for the presigned URL to remain valid. If it’s empty, the S3_PRESIGNED_URL_EXPIRATION configuration setting is used

Returns:

Signed expirable link

Return type:

string
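
sign_url's behavior lines up with boto3's presigned-URL support (MockS3Client above stubs generate_presigned_url). A minimal sketch using boto3 directly, where the bucket and key are placeholders and the real method reads the expiration from configuration:

import boto3

s3 = boto3.client("s3")
signed = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-content-bucket", "Key": "books/1234.epub"},
    ExpiresIn=3600,  # seconds; sign_url falls back to S3_PRESIGNED_URL_EXPIRATION
)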

split_url(url, unquote=True)[source]

Splits the URL into the components: bucket and file path

Parameters:
  • url (string) – URL

  • unquote (bool) – Boolean value indicating whether it’s required to unquote URL elements

Returns:

Tuple (bucket, file path)

Return type:

Tuple[string, string]
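
A rough sketch of the virtual hosted-style case using the standard library (the real method also handles other URL formats and optional unquoting):

from urllib.parse import unquote, urlparse

def naive_split_url(url, unquote_parts=True):
    parsed = urlparse(url)
    bucket = parsed.netloc.split(".")[0]   # "my-bucket.s3....amazonaws.com" -> "my-bucket"
    key = parsed.path.lstrip("/")
    return (bucket, unquote(key) if unquote_parts else key)

print(naive_split_url("https://my-bucket.s3.us-east-1.amazonaws.com/covers/1%20a.png"))
# ('my-bucket', 'covers/1 a.png')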

url(bucket, path)[source]

The URL to a resource on S3 identified by bucket and path.

class core.s3.S3UploaderConfiguration(configuration_storage, db)[source]

Bases: ConfigurationGrouping

BOOK_COVERS_BUCKET_KEY = 'book_covers_bucket'
MARC_BUCKET_KEY = 'marc_bucket'
OA_CONTENT_BUCKET_KEY = 'open_access_content_bucket'
PROTECTED_CONTENT_BUCKET_KEY = 'protected_content_bucket'
S3_ADDRESSING_STYLE = 's3_addressing_style'
S3_DEFAULT_ADDRESSING_STYLE = 'virtual'
S3_DEFAULT_PRESIGNED_URL_EXPIRATION = 3600
S3_DEFAULT_REGION = 'us-east-1'
S3_PRESIGNED_URL_EXPIRATION = 's3_presigned_url_expiration'
S3_REGION = 's3_region'
URL_TEMPLATES_BY_TEMPLATE = {'http': 'http://%(bucket)s/%(key)s', 'https': 'https://%(bucket)s/%(key)s', 'identity': 'https://%(bucket)s.s3.%(region)s/%(key)s'}
URL_TEMPLATE_DEFAULT = 'identity'
URL_TEMPLATE_HTTP = 'http'
URL_TEMPLATE_HTTPS = 'https'
URL_TEMPLATE_KEY = 'bucket_name_transform'
access_key

Contains configuration metadata

book_covers_bucket

Contains configuration metadata

marc_file_bucket

Contains configuration metadata

open_access_content_bucket

Contains configuration metadata

protected_access_content_bucket

Contains configuration metadata

s3_addressing_style

Contains configuration metadata

s3_presigned_url_expiration

Contains configuration metadata

s3_region

Contains configuration metadata

secret_key

Contains configuration metadata

url_template

Contains configuration metadata

core.scripts module

class core.scripts.AddClassificationScript(_db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>)[source]

Bases: IdentifierInputScript

classmethod arg_parser()[source]
do_run()[source]
name = 'Add a classification to an identifier'
class core.scripts.CheckContributorNamesInDB(_db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>)[source]

Bases: IdentifierInputScript

Checks that contributor sort_names are display_names in “last name, comma, other names” format.

Reads contributors edition by edition, so that the database query can, if necessary, be restricted by passed-in identifiers, and so that the associated license pools can be found in order to register author complaints against them.

NOTE: There’s also CheckContributorNamesOnWeb in the metadata wrangler code; it’s a child of this script. Use it to check our knowledge against VIAF, with the newer, better sort_name selection and formatting.

TODO: make sure we don’t start at the beginning again when interrupted while a batch job is running.

COMPLAINT_SOURCE = 'CheckContributorNamesInDB'
COMPLAINT_TYPE = 'http://librarysimplified.org/terms/problem/wrong-author'
do_run(batch_size=10)[source]
classmethod make_query(_db, identifier_type, identifiers, log=None)[source]
process_contribution_local(_db, contribution, log=None)[source]
process_local_mismatch(_db, contribution, computed_sort_name, error_message_detail, log=None)[source]

Determines whether a problem should be investigated further or recorded as a Complaint to be resolved by a human. In this class, it’s always a complaint. In the overridden method in the child class in the metadata_wrangler code, we sometimes perform a web query instead.

classmethod register_problem(source, contribution, computed_sort_name, error_message_detail, log=None)[source]

Make a Complaint in the database, so a human can take a look at this Contributor’s name and resolve whatever complex issue got us here.

classmethod set_contributor_sort_name(sort_name, contribution)[source]

Sets the contributor.sort_name and associated edition.author_name to the passed-in value.

class core.scripts.CollectionArgumentsScript(_db=None)[source]

Bases: CollectionInputScript

classmethod arg_parser()[source]
class core.scripts.CollectionInputScript(_db=None)[source]

Bases: Script

A script that takes collection names as command line inputs.

classmethod arg_parser()[source]
classmethod look_up_collections(_db, parsed, *args, **kwargs)[source]

Turn collection names as specified on the command line into real database Collection objects.

classmethod parse_command_line(_db=None, cmd_args=None, *args, **kwargs)[source]
class core.scripts.CollectionType(value)[source]

Bases: Enum

An enumeration.

LCP = 'LCP'
OPEN_ACCESS = 'OPEN_ACCESS'
PROTECTED_ACCESS = 'PROTECTED_ACCESS'
class core.scripts.ConfigurationSettingScript(_db=None)[source]

Bases: Script

classmethod add_setting_argument(parser, help)[source]

Modify an ArgumentParser to indicate that the script takes command-line settings.

apply_settings(settings, obj)[source]

Treat settings as a list of command-line argument settings, and apply each one to obj.

class core.scripts.ConfigureCollectionScript(_db=None)[source]

Bases: ConfigurationSettingScript

Create a collection or change its settings.

classmethod arg_parser(_db)[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = "Change a collection's settings"
classmethod parse_command_line(_db=None, cmd_args=None)[source]
class core.scripts.ConfigureIntegrationScript(_db=None)[source]

Bases: ConfigurationSettingScript

Create an integration or change its settings.

classmethod arg_parser(_db)[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = "Create a site-wide integration or change an integration's settings"
classmethod parse_command_line(_db=None, cmd_args=None)[source]
class core.scripts.ConfigureLaneScript(_db=None)[source]

Bases: ConfigurationSettingScript

Create a lane or change its settings.

classmethod arg_parser(_db)[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = "Change a lane's settings"
classmethod parse_command_line(_db=None, cmd_args=None)[source]
class core.scripts.ConfigureLibraryScript(_db=None)[source]

Bases: ConfigurationSettingScript

Create a library or change its settings.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = "Change a library's settings"
class core.scripts.ConfigureSiteScript(_db=None, config=<class 'core.config.Configuration'>)[source]

Bases: ConfigurationSettingScript

View or update site-wide configuration.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
class core.scripts.CustomListManagementScript(manager_class, data_source_name, list_identifier, list_name, primary_language, description, **manager_kwargs)[source]

Bases: Script

Maintain a CustomList whose membership is determined by a MembershipManager.

run()[source]
class core.scripts.CustomListSweeperScript(_db=None)[source]

Bases: LibraryInputScript

Do something to each custom list in a library.

process_custom_list(custom_list)[source]
process_library(library)[source]
class core.scripts.DatabaseMigrationInitializationScript(*args, **kwargs)[source]

Bases: DatabaseMigrationScript

Creates a timestamp to kick off the regular use of DatabaseMigrationScript to manage migrations.

classmethod arg_parser()[source]
run(cmd_args=None)[source]
class core.scripts.DatabaseMigrationScript(*args, **kwargs)[source]

Bases: Script

Runs new migrations.

This script needs to execute without ever loading an ORM object, because the database might be in a state that’s not compatible with the current ORM version.

This is not a TimestampScript because it keeps separate Timestamps for the Python and the SQL migrations, and because Timestamps are ORM objects, which this script can’t touch.

DO_NOT_EXECUTE = 'SIMPLYE_MIGRATION_DO_NOT_EXECUTE'
MIGRATION_WITH_COUNTER = re.compile('\\d{8}-(\\d+)-(.)+\\.(py|sql)')
PY_TIMESTAMP_SERVICE_NAME = 'Database Migration - Python'
SERVICE_NAME = 'Database Migration'
TRANSACTIONLESS_COMMANDS = ['alter type']
TRANSACTION_PER_STATEMENT = 'SIMPLYE_MIGRATION_TRANSACTION_PER_STATEMENT'
class TimestampInfo(service, finish, counter=None)[source]

Bases: object

Act like an ORM Timestamp object, but with no database connection.

classmethod find(script, service)[source]

Find an existing timestamp representing the last migration script that was run, or create one.

Returns:

A TimestampInfo object or None

save(_db)[source]
update(_db, finish, counter, migration_name=None)[source]

Saves a TimestampInfo object to the database.

classmethod arg_parser()[source]
property directories_by_priority

Returns a list containing the migration directory path for core and its container server, organized in priority order (core first)

fetch_migration_files()[source]

Pulls migration files from the expected locations

Returns:

a tuple with a list of migration filenames and a dictionary of those files separated by their absolute directory location.

get_new_migrations(timestamp, migrations)[source]

Return a list of migration filenames, representing migrations created since the timestamp

load_configuration()[source]

Load configuration without accessing the database.

classmethod migratable_files(filelist, extensions)[source]

Filter a list of files for migratable file extensions

property name

Returns the appropriate target Timestamp service name for the timestamp, depending on the script parameters.

property overall_timestamp

Returns a TimestampInfo object corresponding to the overall or general “Database Migration” service.

If there is no Timestamp or the Timestamp doesn’t have a timestamp attribute, it returns None.

property python_timestamp

Returns a TimestampInfo object corresponding to the Python migration-specific “Database Migration - Python” Timestamp.

If there is no Timestamp or the Timestamp hasn’t been initialized with a timestamp attribute, it returns None.

run(test_db=None, test=False, cmd_args=None)[source]
run_migrations(migrations, migrations_by_dir, timestamp)[source]

Run each migration, first by timestamp and then by directory priority.

classmethod sort_migrations(migrations)[source]

All Python migrations sort after all SQL migrations, since a Python migration requires an up-to-date database schema.

Migrations with a counter digit sort after migrations without one.
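
A hedged sketch of that ordering (not the actual implementation; the real sort may weigh the date and counter value differently):

import re

MIGRATION_WITH_COUNTER = re.compile(r"\d{8}-(\d+)-(.)+\.(py|sql)")

def naive_sort_key(filename):
    is_python = filename.endswith(".py")                          # SQL before Python
    has_counter = bool(MIGRATION_WITH_COUNTER.match(filename))    # no counter before counter
    return (is_python, has_counter, filename)

files = ["20200102-1-fix_data.py", "20200101-create_index.sql", "20200102-cleanup.py"]
print(sorted(files, key=naive_sort_key))
# ['20200101-create_index.sql', '20200102-cleanup.py', '20200102-1-fix_data.py']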

update_timestamps(migration_file)[source]

Updates this service’s timestamp to match a given migration

class core.scripts.DatabaseVacuum[source]

Bases: Script

Script to vacuum all database tables


do_run(subcommand='')[source]

Run the database vacuum

Args:
subcommand (str, optional):

Can be any of these VACUUM options:

  • FULL [ boolean ]

  • FREEZE [ boolean ]

  • VERBOSE [ boolean ]

  • ANALYZE [ boolean ]

  • DISABLE_PAGE_SKIPPING [ boolean ]

  • SKIP_LOCKED [ boolean ]

  • INDEX_CLEANUP { AUTO | ON | OFF }

  • PROCESS_TOAST [ boolean ]

  • TRUNCATE [ boolean ]
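
For reference, these are PostgreSQL VACUUM options. A minimal sketch of the equivalent raw SQL issued outside the script (the connection URL and table name are placeholders, and the exact way do_run splices the subcommand into the statement is not shown here; VACUUM cannot run inside a transaction, hence the autocommit connection):

from sqlalchemy import create_engine, text

engine = create_engine("postgresql://localhost/circulation")  # placeholder DSN
with engine.connect() as connection:
    connection = connection.execution_options(isolation_level="AUTOCOMMIT")
    # Hedged example: a full vacuum with analyze on a single table.
    connection.execute(text("VACUUM (FULL, ANALYZE) editions"))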

class core.scripts.Explain(_db=None)[source]

Bases: IdentifierInputScript

Explain everything known about a given work.

METADATA_URL_TEMPLATE = 'http://metadata.librarysimplified.org/lookup?urn=%s'
TIME_FORMAT = '%Y-%m-%d %H:%M'
do_run(cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>, stdout=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
explain(_db, edition, presentation_calculation_policy=None)[source]
explain_contribution(contribution)[source]
explain_coverage_record(cr)[source]
explain_identifier(identifier, primary, seen, strength, level)[source]
explain_license_pool(pool)[source]
explain_work(work)[source]
explain_work_coverage_record(cr)[source]
name = 'Explain everything known about a given work'
write(s)[source]

Write a string to self.stdout.

class core.scripts.IdentifierInputScript(_db=None)[source]

Bases: InputScript

A script that takes identifiers as command line inputs.

DATABASE_ID = 'Database ID'
classmethod arg_parser()[source]
classmethod look_up_identifiers(_db, parsed, stdin_identifier_strings, *args, **kwargs)[source]

Turn identifiers as specified on the command line into real database Identifier objects.

classmethod parse_command_line(_db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>, *args, **kwargs)[source]
classmethod parse_identifier_list(_db, identifier_type, data_source, arguments, autocreate=False)[source]

Turn a list of identifiers into a list of Identifier objects.

The list of arguments is probably derived from a command-line parser such as the one defined in IdentifierInputScript.arg_parser().

This makes it easy to identify specific identifiers on the command line. Examples:

1 2

a b c
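
A hedged sketch of the equivalent call from Python, where _db is assumed to be an existing SQLAlchemy session and Identifier.GUTENBERG_ID is assumed to be the matching identifier-type constant in core.model:

from core.model import Identifier
from core.scripts import IdentifierInputScript

# Resolve two Gutenberg identifiers given as plain strings (e.g. from argv);
# passing None for data_source is illustrative.
identifiers = IdentifierInputScript.parse_identifier_list(
    _db, Identifier.GUTENBERG_ID, None, ["1", "2"]
)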

class core.scripts.InputScript(_db=None)[source]

Bases: Script

classmethod read_stdin_lines(stdin)[source]

Read lines from a (possibly mocked, possibly empty) standard input.

class core.scripts.LaneSweeperScript(_db=None)[source]

Bases: LibraryInputScript

Do something to each lane in a library.

process_lane(lane)[source]
process_library(library)[source]
should_process_lane(lane)[source]
class core.scripts.LibraryInputScript(_db=None)[source]

Bases: InputScript

A script that operates on one or more Libraries.

classmethod arg_parser(_db, multiple_libraries=True)[source]
do_run(*args, **kwargs)[source]
classmethod look_up_libraries(_db, parsed, *args, **kwargs)[source]

Turn library names as specified on the command line into real Library objects.

classmethod parse_command_line(_db=None, cmd_args=None, *args, **kwargs)[source]
classmethod parse_library_list(_db, arguments)[source]

Turn a list of library short names into a list of Library objects.

The list of arguments is probably derived from a command-line parser such as the one defined in LibraryInputScript.arg_parser().

process_libraries(libraries)[source]
process_library(library)[source]
class core.scripts.ListCollectionMetadataIdentifiersScript(_db=None, output=None)[source]

Bases: CollectionInputScript

List the metadata identifiers for Collections in the database.

This script is helpful for accounting for and tracking collections on the metadata wrangler.

do_run(collections=None)[source]
run(cmd_args=None)[source]
class core.scripts.MirrorResourcesScript(_db=None)[source]

Bases: CollectionInputScript

Make sure that all mirrorable resources in a collection have in fact been mirrored.

MIRROR_UTILITY = <core.metadata_layer.MetaToModelUtility object>
collections_with_uploader(collections, collection_type=CollectionType.OPEN_ACCESS)[source]

Filter out collections that have no MirrorUploader.

Yield:

2-tuples (Collection, ReplacementPolicy). The ReplacementPolicy is the appropriate one for this script to use for that Collection.

classmethod derive_rights_status(license_pool, resource)[source]

Make a best guess about the rights status for the given resource.

This relies on the information having been available at one point, but having been stored in the database at a slight remove.

do_run(cmd_args=None)[source]
process_collection(collection, policy, unmirrored=None)[source]

Make sure every mirrorable resource in this collection has been mirrored.

Parameters:

unmirrored – A replacement for Hyperlink.unmirrored, for use in tests.

process_item(collection, link_obj, policy)[source]

Determine the URL that needs to be mirrored and (for books) the rationale that lets us mirror that URL. Then mirror it.

classmethod replacement_policy(mirrors)[source]

Create a ReplacementPolicy for this script that uses the given mirrors.

class core.scripts.MockStdin(*lines)[source]

Bases: object

Mock a list of identifiers passed in on standard input.

readlines()[source]
class core.scripts.OPDSImportScript(_db=None, importer_class=None, monitor_class=None, protocol=None, *args, **kwargs)[source]

Bases: CollectionInputScript

Import all books from the OPDS feed associated with a collection.

IMPORTER_CLASS

alias of OPDSImporter

MONITOR_CLASS

alias of OPDSImportMonitor

PROTOCOL = 'OPDS Import'
classmethod arg_parser()[source]
do_run(cmd_args=None)[source]
name = 'Import all books from the OPDS feed associated with a collection.'
run_monitor(collection, force=None)[source]
class core.scripts.PatronInputScript(_db=None)[source]

Bases: LibraryInputScript

A script that operates on one or more Patrons.

classmethod arg_parser(_db)[source]
do_run(*args, **kwargs)[source]
classmethod look_up_patrons(_db, parsed, stdin_patron_strings, *args, **kwargs)[source]

Turn patron identifiers as specified on the command line into real Patron objects.

classmethod parse_command_line(_db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>, *args, **kwargs)[source]
classmethod parse_patron_list(_db, library, arguments)[source]

Turn a list of patron identifiers into a list of Patron objects.

The list of arguments is probably derived from a command-line parser such as the one defined in PatronInputScript.arg_parser().

process_patron(patron)[source]
process_patrons(patrons)[source]
class core.scripts.RebuildSearchIndexScript(*args, **kwargs)[source]

Bases: RunWorkCoverageProviderScript, RemovesSearchCoverage

Completely delete the search index and recreate it.

do_run()[source]
class core.scripts.ReclassifyWorksForUncheckedSubjectsScript(_db=None)[source]

Bases: WorkClassificationScript

Reclassify all Works whose current classifications appear to depend on Subjects in the ‘unchecked’ state.

This generally means that some migration script reset those Subjects because the rules for processing them changed.

batch_size = 100
name = 'Reclassify works that use unchecked subjects.'
policy = <core.model.PresentationCalculationPolicy object>
class core.scripts.RemovesSearchCoverage[source]

Bases: object

Mix-in class for a script that might remove all coverage records for the search engine.

remove_search_coverage_records()[source]

Delete all search coverage records from the database.

Returns:

The number of records deleted.

class core.scripts.RunCollectionCoverageProviderScript(provider_class, _db=None, providers=None, **kwargs)[source]

Bases: RunCoverageProvidersScript

Run the same CoverageProvider code for all Collections that get their licenses from the appropriate place.

get_providers(_db, provider_class, **kwargs)[source]
class core.scripts.RunCollectionMonitorScript(monitor_class, _db=None, cmd_args=None, **kwargs)[source]

Bases: RunMultipleMonitorsScript, CollectionArgumentsScript

Run a CollectionMonitor on every Collection that comes through a certain protocol.

monitors(**kwargs)[source]

Find all the Monitors that need to be run.

Returns:

A list of Monitor objects.

class core.scripts.RunCoverageProviderScript(provider, _db=None, cmd_args=None, *provider_args, **provider_kwargs)[source]

Bases: IdentifierInputScript

Run a single coverage provider.

classmethod arg_parser()[source]
do_run()[source]
extract_additional_command_line_arguments()[source]

A hook method for subclasses.

Turns command-line arguments into additional keyword arguments to the CoverageProvider constructor.

By default, pass in a value used only by CoverageProvider (as opposed to WorkCoverageProvider).

classmethod parse_command_line(_db, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>, *args, **kwargs)[source]
class core.scripts.RunCoverageProvidersScript(providers, _db=None)[source]

Bases: Script

Alternate between multiple coverage providers.

do_run()[source]
class core.scripts.RunMonitorScript(monitor, _db=None, **kwargs)[source]

Bases: Script

do_run()[source]
class core.scripts.RunMultipleMonitorsScript(_db=None, **kwargs)[source]

Bases: Script

Run a number of monitors in sequence.

Currently the Monitors are run one at a time. It should be possible to take a command-line argument that runs all the Monitors in batches, each in its own thread. Unfortunately, it’s tough to know in a given situation that this won’t overload the system.

do_run()[source]
monitors(**kwargs)[source]

Find all the Monitors that need to be run.

Returns:

A list of Monitor objects.

class core.scripts.RunReaperMonitorsScript(_db=None, **kwargs)[source]

Bases: RunMultipleMonitorsScript

Run all the monitors found in ReaperMonitor.REGISTRY

monitors(**kwargs)[source]

Find all the Monitors that need to be run.

Returns:

A list of Monitor objects.

name = 'Run all reaper monitors'
class core.scripts.RunThreadedCollectionCoverageProviderScript(provider_class, worker_size=None, _db=None, **provider_kwargs)[source]

Bases: Script

Run coverage providers in multiple threads.

DEFAULT_WORKER_SIZE = 5
get_query_and_batch_sizes(provider)[source]
run(pool=None)[source]

Runs a CollectionCoverageProvider with multiple threads and updates the timestamp accordingly.

Parameters:

pool – A DatabasePool (or other) object for use in testing environments.

class core.scripts.RunWorkCoverageProviderScript(provider_class, _db=None, providers=None, **kwargs)[source]

Bases: RunCollectionCoverageProviderScript

Run a WorkCoverageProvider on every relevant Work in the system.

get_providers(_db, provider_class, **kwargs)[source]
class core.scripts.Script(_db=None)[source]

Bases: object

classmethod arg_parser()[source]
property data_directory
load_configuration()[source]
property log
classmethod parse_command_line(_db=None, cmd_args=None)[source]
classmethod parse_time(time_string)[source]

Try to parse the given string as a time.

run()[source]
property script_name

Find or guess the name of the script.

This is either the .name of the Script object or the name of the class.

update_timestamp(timestamp_data, start_time, exception)[source]

By default scripts have no timestamp of their own.

Most scripts either work through Monitors or CoverageProviders, which have their own logic for creating timestamps, or they are designed to be run interactively from the command-line, so facts about when they last ran are not relevant.

Parameters:
  • start_time – The time the script started running.

  • exception – A stack trace for the exception, if any, that stopped the script from running.

class core.scripts.SearchIndexCoverageRemover(*args, **kwargs)[source]

Bases: TimestampScript, RemovesSearchCoverage

Script that removes search index coverage for all works.

This guarantees the SearchIndexCoverageProvider will add fresh coverage for every Work the next time it runs.

do_run()[source]
class core.scripts.ShowCollectionsScript(_db=None)[source]

Bases: Script

Show information about the collections on a server.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = 'List the collections on this server.'
class core.scripts.ShowIntegrationsScript(_db=None)[source]

Bases: Script

Show information about the external integrations on a server.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = 'List the external integrations on this server.'
class core.scripts.ShowLanesScript(_db=None)[source]

Bases: Script

Show information about the lanes on a server.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = 'List the lanes on this server.'
class core.scripts.ShowLibrariesScript(_db=None)[source]

Bases: Script

Show information about the libraries on a server.

classmethod arg_parser()[source]
do_run(_db=None, cmd_args=None, output=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>)[source]
name = 'List the libraries on this server.'
class core.scripts.SubjectInputScript(_db=None)[source]

Bases: Script

A script whose command line filters the set of Subjects.

Returns:

a 2-tuple (subject type, subject filter) that can be passed into the SubjectSweepMonitor constructor.

classmethod arg_parser()[source]
class core.scripts.TimestampScript(*args, **kwargs)[source]

Bases: Script

A script that automatically records a timestamp whenever it runs.

update_timestamp(timestamp_data, start, exception)[source]

Update the appropriate Timestamp for this script.

Parameters:
  • timestamp_data – A TimestampData representing what the script itself thinks its timestamp should look like. Data will be filled in where it is missing, but it will not be modified if present.

  • start – The time at which this script believes the service started running. The script itself may change this value for its own purposes.

  • exception – The exception with which this script believes the service stopped running. The script itself may change this value for its own purposes.

class core.scripts.UpdateCustomListSizeScript(_db=None)[source]

Bases: CustomListSweeperScript

process_custom_list(custom_list)[source]
class core.scripts.UpdateLaneSizeScript(_db=None)[source]

Bases: LaneSweeperScript

process_lane(lane)[source]

Update the estimated size of a Lane.

should_process_lane(lane)[source]

We don’t want to process generic WorkLists – there’s nowhere to store the data.

class core.scripts.WhereAreMyBooksScript(_db=None, output=None, search=None)[source]

Bases: CollectionInputScript

Try to figure out why Works aren’t showing up.

This is a common problem on a new installation or when a new collection is being configured.

check_library(library)[source]

Make sure a library is properly set up to show works.

delete_cached_feeds()[source]
explain_collection(collection)[source]
out(s, *args)[source]
run(cmd_args=None)[source]
class core.scripts.WorkClassificationScript(*args, **kwargs)[source]

Bases: WorkPresentationScript

Recalculate the classification, and nothing else, for Work objects.

name = 'Recalculate the classification for works that need it.'
policy = <core.model.PresentationCalculationPolicy object>
class core.scripts.WorkConsolidationScript(force=False, batch_size=10, _db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>)[source]

Bases: WorkProcessingScript

Given an Identifier, make sure all the LicensePools for that Identifier are in Works that follow these rules:

a) For a given permanent work ID, there may be at most one Work containing open-access LicensePools.

b) Each non-open-access LicensePool has its own individual Work.

do_run()[source]
make_query(_db, identifier_type, identifiers, data_source, log=None)[source]
name = 'Work consolidation script'
process_work(work)[source]
class core.scripts.WorkOPDSScript(*args, **kwargs)[source]

Bases: WorkPresentationScript

Recalculate the OPDS entries, MARC record, and search index entries for Work objects.

This is intended to verify that a problem has already been resolved and just needs to be propagated to these three ‘caches’.

name = 'Recalculate OPDS entries, MARC record, and search index entries for works that need it.'
policy = <core.model.PresentationCalculationPolicy object>
class core.scripts.WorkPresentationScript(*args, **kwargs)[source]

Bases: TimestampScript, WorkProcessingScript

Calculate the presentation for Work objects.

name = 'Recalculate the presentation for works that need it.'
policy = <core.model.PresentationCalculationPolicy object>
process_work(work)[source]
class core.scripts.WorkProcessingScript(force=False, batch_size=10, _db=None, cmd_args=None, stdin=<_io.TextIOWrapper name='<stdin>' mode='r' encoding='utf-8'>)[source]

Bases: IdentifierInputScript

do_run()[source]
classmethod make_query(_db, identifier_type, identifiers, data_source, log=None)[source]
name = 'Work processing script'
process_work(work)[source]

core.selftest module

Define the interfaces used by ExternalIntegration self-tests.

class core.selftest.HasSelfTests[source]

Bases: object

An object capable of verifying its own setup by running a series of self-tests.

SELF_TEST_RESULTS_SETTING = 'self_test_results'
external_integration(_db)[source]

Locate the ExternalIntegration associated with this object. The status of the self-tests will be stored as a ConfigurationSetting on this ExternalIntegration.

By default, there is no way to get from an object to its ExternalIntegration, and self-test status will not be stored.

classmethod prior_test_results(_db, constructor_method=None, *args, **kwargs)[source]

Retrieve the last set of test results from the database.

The arguments here are the same as the arguments to run_self_tests.

classmethod run_self_tests(_db, constructor_method=None, *args, **kwargs)[source]

Instantiate this class and call _run_self_tests on it.

Parameters:
  • _db – A database connection. Will be passed into _run_self_tests. This connection may need to be used again in args, if the constructor needs it.

  • constructor_method – Method to use to instantiate the class, if different from the default constructor.

  • args – Positional arguments to pass into the constructor.

  • kwargs – Keyword arguments to pass into the constructor.

Returns:

A 2-tuple (results_dict, results_list). results_dict is a JSON-serializable dictionary describing the results of the self-test; results_list is a list of SelfTestResult objects.
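
A minimal sketch of a subclass, assuming _run_self_tests is the per-instance hook referred to above and that it yields results produced by run_test; everything else is illustrative:

from core.selftest import HasSelfTests

class ExampleService(HasSelfTests):
    """Hedged sketch, not a real integration."""

    def _run_self_tests(self, _db):
        # Each check is wrapped in run_test so timing and exceptions are recorded.
        yield self.run_test("Service answers a ping", self._ping)

    def _ping(self):
        return "pong"

# Hedged usage: results_dict is JSON-serializable; results_list holds SelfTestResult objects.
# results_dict, results_list = ExampleService.run_self_tests(_db)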

run_test(name, method, *args, **kwargs)[source]

Run a test method, record any exception that happens, and keep track of how long the test takes to run.

Parameters:
  • name – The name of the test to be run.

  • method – A method to call to run the test.

  • args – Positional arguments to method.

  • kwargs – Keyword arguments to method.

Returns:

A filled-in SelfTestResult.

classmethod test_failure(name, message, debug_message=None)[source]

Create a SelfTestResult for a known failure.

This is useful when you can’t even get the data necessary to run a test method.

class core.selftest.SelfTestResult(name)[source]

Bases: object

The result of running a single self-test.

HasSelfTests.run_self_tests() returns a list of these.

property debug_message

The debug message associated with the Exception, if any.

property duration

How long the test took to run.

property to_dict

Convert this SelfTestResult to a dictionary for use in JSON serialization.

core.testing module

class core.testing.AlwaysSuccessfulBibliographicCoverageProvider(collection, **kwargs)[source]

Bases: MockCoverageProvider, BibliographicCoverageProvider

A BibliographicCoverageProvider that does nothing and is always successful.

Note that this only works if you’ve put a working Edition and LicensePool in place beforehand. Otherwise the process will fail during handle_success().

SERVICE_NAME = 'Always successful (bibliographic)'
process_item(identifier)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.AlwaysSuccessfulCollectionCoverageProvider(collection, **kwargs)[source]

Bases: MockCoverageProvider, CollectionCoverageProvider

A CollectionCoverageProvider that does nothing and always succeeds.

SERVICE_NAME = 'Always successful (collection)'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.AlwaysSuccessfulCoverageProvider(*args, **kwargs)[source]

Bases: InstrumentedCoverageProvider

A CoverageProvider that does nothing and always succeeds.

SERVICE_NAME = 'Always successful'
class core.testing.AlwaysSuccessfulWorkCoverageProvider(_db, *args, **kwargs)[source]

Bases: InstrumentedWorkCoverageProvider

A WorkCoverageProvider that does nothing and always succeeds.

SERVICE_NAME = 'Always successful (works)'
class core.testing.BrokenBibliographicCoverageProvider(*args, **kwargs)[source]

Bases: BrokenCoverageProvider, BibliographicCoverageProvider

SERVICE_NAME = 'Broken (bibliographic)'
class core.testing.BrokenCoverageProvider(*args, **kwargs)[source]

Bases: InstrumentedCoverageProvider

SERVICE_NAME = 'Broken'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.DatabaseTest[source]

Bases: object

connection = None
engine = None
classmethod get_database_connection()[source]
classmethod make_default_library(_db)[source]

Ensure that the default library exists in the given database.

This can be called by code intended for use in testing but not actually within a DatabaseTest subclass.

classmethod print_database_class(db_connection)[source]

Prints to the console the entire contents of the database, as the unit test sees it. Exists because unit tests don’t persist db information: they create an in-memory representation of the db state and then roll the unit-test-derived transactions back, so we cannot see what’s going on by going into postgres and running selects. This is the in-test alternative to going into postgres.

Can be called from model and metadata classes as well as tests.

NOTE: The purpose of this method is for debugging. Be careful of leaving it in code and potentially outputting vast tracts of data into your output stream on production.

Call like this:

set_trace()
from testing import (
    DatabaseTest,
)
_db = Session.object_session(self)
DatabaseTest.print_database_class(_db)

TODO: remove before prod
print_database_instance()[source]

Calls the class method that examines the current state of the database model (whether it’s been committed or not).

NOTE: If you set_trace and hit “continue”, you’ll start seeing console output right away, without waiting for the whole test to run and the standard output section to display. You can also use nosetests --nocapture.

I use:

def test_name(self):
    [code...]
    set_trace()
    self.print_database_instance()  # TODO: remove before prod
    [code...]
sample_cover_path(name)[source]

The path to the sample cover with the given filename.

sample_cover_representation(name)[source]

A Representation of the sample cover with the given filename.

search_mock(request)[source]
classmethod setup_class()[source]
setup_method()[source]
shortDescription()[source]
classmethod teardown_class()[source]
teardown_method()[source]
time_eq(a, b)[source]

Assert that two times are approximately the same – within 2 seconds.

class core.testing.DummyCanonicalizeLookupResponse[source]

Bases: object

classmethod failure()[source]
classmethod success(result)[source]
class core.testing.DummyHTTPClient[source]

Bases: object

do_get(url, *args, **kwargs)[source]
do_post(url, data, *wargs, **kwargs)[source]
queue_requests_response(response_code, media_type='text/html', other_headers=None, content='')[source]

Queue a response of the type produced by HTTP.get_with_timeout.

queue_response(response_code, media_type='text/html', other_headers=None, content='')[source]

Queue a response of the type produced by Representation.simple_http_get.
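
A short usage sketch in a test; the shape of the value returned by do_get is assumed to mirror Representation.simple_http_get:

from core.testing import DummyHTTPClient

client = DummyHTTPClient()
client.queue_response(200, media_type="application/json", content='{"ok": true}')

# Assumption: do_get hands back the queued response in the shape produced by
# Representation.simple_http_get, instead of making a real HTTP request.
response = client.do_get("http://example.com/feed")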

class core.testing.DummyMetadataClient[source]

Bases: object

canonicalize_author_name(primary_identifier, display_author)[source]
class core.testing.EndToEndSearchTest[source]

Bases: ExternalSearchTest

Subclasses of this class set up real works in a real search index and run searches against it.

populate_works()[source]
setup_method()[source]
class core.testing.ExternalSearchTest[source]

Bases: DatabaseTest

These tests require Elasticsearch to be running locally. If it’s not, or there’s an error creating the index, the tests will pass without doing anything.

Tests for Elasticsearch are useful for ensuring that we haven’t accidentally broken a type of search by changing analyzers or queries, but search needs to be tested manually to ensure that it works well overall, with a realistic index.

SIMPLIFIED_TEST_ELASTICSEARCH = 'http://localhost:9200'
default_work(*args, **kwargs)[source]

Convenience method to create a work with a license pool in the default collection.

pytestmark = [Mark(name='elasticsearch', args=(), kwargs={})]
setup_index(new_index)[source]

Create an index and register it to be destroyed during teardown.

setup_method()[source]
teardown_method()[source]
class core.testing.InstrumentedCoverageProvider(*args, **kwargs)[source]

Bases: MockCoverageProvider, IdentifierCoverageProvider

A CoverageProvider that keeps track of every item it tried to cover.

process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.InstrumentedWorkCoverageProvider(_db, *args, **kwargs)[source]

Bases: MockCoverageProvider, WorkCoverageProvider

A WorkCoverageProvider that keeps track of every item it tried to cover.

process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.LogCaptureHandler(logger, *args, **kwargs)[source]

Bases: Handler

A logging.Handler context manager that captures the messages of emitted log records in the context of the specified logger.

LEVEL_NAMES = ['critical', 'error', 'warning', 'info', 'debug', 'notset']
emit(record)[source]

Do whatever it takes to actually log the specified logging record.

This version is intended to be implemented by subclasses and so raises a NotImplementedError.

reset()[source]

Empty the message accumulators.
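
A usage sketch as a context manager; exactly how the captured messages are exposed (e.g. one accumulator per level name from LEVEL_NAMES) is an assumption:

import logging
from core.testing import LogCaptureHandler

logger = logging.getLogger("core.scripts")
with LogCaptureHandler(logger) as handler:
    logger.error("could not reach the search index")

# Assumption: captured messages are grouped by level name, e.g. handler.error
# holding the messages emitted at ERROR level while the context was active.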

class core.testing.MockCoverageProvider[source]

Bases: object

Mixin class for mock CoverageProviders that defines common constants.

DATA_SOURCE_NAME = 'Gutenberg'
INPUT_IDENTIFIER_TYPES = None
PROTOCOL = 'OPDS Import'
SERVICE_NAME = 'Generic mock CoverageProvider'
class core.testing.MockRequestsRequest(url, method='GET', headers=None)[source]

Bases: object

A mock object that simulates an HTTP request from the requests library.

class core.testing.MockRequestsResponse(status_code, headers={}, content=None, url=None, request=None)[source]

Bases: object

A mock object that simulates an HTTP response from the requests library.

json()[source]
raise_for_status()[source]

Null implementation of raise_for_status, a method implemented by real requests Response objects.

property text
class core.testing.NeverSuccessfulBibliographicCoverageProvider(collection, **kwargs)[source]

Bases: MockCoverageProvider, BibliographicCoverageProvider

Simulates a BibliographicCoverageProvider that’s never successful.

SERVICE_NAME = 'Never successful (bibliographic)'
process_item(identifier)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.NeverSuccessfulCoverageProvider(*args, **kwargs)[source]

Bases: InstrumentedCoverageProvider

A CoverageProvider that does nothing and always fails.

SERVICE_NAME = 'Never successful'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.NeverSuccessfulWorkCoverageProvider(_db, *args, **kwargs)[source]

Bases: InstrumentedWorkCoverageProvider

SERVICE_NAME = 'Never successful (works)'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.SearchClientForTesting(_db, url=None, works_index=None, test_search_term=None, in_testing=False, mapping=None)[source]

Bases: ExternalSearchIndex

When creating an index, limit it to a single shard and disable replicas.

This makes search results more predictable.

setup_index(new_index=None)[source]

Create the search index with appropriate mapping.

This will destroy the search index, and all works will need to be indexed again. In production, don’t use this on an existing index. Use it to create a new index, then change the alias to point to the new index.

class core.testing.TaskIgnoringCoverageProvider(*args, **kwargs)[source]

Bases: InstrumentedCoverageProvider

A coverage provider that ignores all work given to it.

SERVICE_NAME = 'I ignore all work.'
process_batch(batch)[source]

Do what it takes to give coverage records to a batch of items.

Returns:

A mixed list of coverage records and CoverageFailures.

class core.testing.TransientFailureCoverageProvider(*args, **kwargs)[source]

Bases: InstrumentedCoverageProvider

SERVICE_NAME = 'Never successful (transient)'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

class core.testing.TransientFailureWorkCoverageProvider(_db, *args, **kwargs)[source]

Bases: InstrumentedWorkCoverageProvider

SERVICE_NAME = 'Never successful (transient, works)'
process_item(item)[source]

Do the work necessary to give coverage to one specific item.

Since this is where the actual work happens, this is not implemented in IdentifierCoverageProvider or WorkCoverageProvider, and must be handled in a subclass.

core.testing.pytest_configure(config)[source]
core.testing.session_fixture()[source]

core.user_profile module

class core.user_profile.MockProfileStorage(read_only_settings=None, writable_settings=None)[source]

Bases: ProfileStorage

A profile storage object for use in tests.

Keeps information in in-memory dictionaries rather than in a database.

property profile_document

Create a Profile document representing the current state of the user’s profile.

Returns:

A dictionary that can be serialized as JSON.

update(new_values, profile_document)[source]

(Try to) change the user’s profile so it looks like the provided Profile document.

property writable_setting_names

Return the subset of fields that are considered writable.

class core.user_profile.ProfileController(storage)[source]

Bases: object

Implement the User Profile Management Protocol.

https://github.com/NYPL-Simplified/Simplified/wiki/User-Profile-Management-Protocol

MEDIA_TYPE = 'vnd.librarysimplified/user-profile+json'
get()[source]

Turn the storage object into a Profile document and send out its JSON-based representation.

Returns:

A ProblemDetail if there is a problem; otherwise, a 3-tuple (entity-body, response code, headers).

put(headers, body)[source]

Update the profile storage object with new settings from a Profile document sent with a PUT request.

Returns:

A ProblemDetail if there is a problem; otherwise, a 3-tuple (response code, media type, entity-body).

class core.user_profile.ProfileStorage[source]

Bases: object

An abstract class defining a specific user’s profile.

Subclasses should get profile information from somewhere specific, e.g. a database row.

An instance of this class is responsible for one specific user’s profile, not the set of all profiles.

AUTHORIZATION_EXPIRES = 'simplified:authorization_expires'
AUTHORIZATION_IDENTIFIER = 'simplified:authorization_identifier'
FINES = 'simplified:fines'
NS = 'simplified:'
SETTINGS_KEY = 'settings'
SYNCHRONIZE_ANNOTATIONS = 'simplified:synchronize_annotations'
property profile_document

Create a Profile document representing the current state of the user’s profile.

Returns:

A dictionary that can be serialized as JSON.

update(new_values, profile_document)[source]

(Try to) change the user’s profile so it looks like the provided Profile document.

Parameters:
  • new_values – A dictionary of settings that the client wants to change.

  • profile_document – The full Profile document as provided by the client. Should not be necessary, but provided in case it’s useful.

Raises:

Exception – If there’s a problem making the user’s profile look like the provided Profile document.

property writable_setting_names

Return the subset of settings that are considered writable.

An attempt to modify a setting that’s not in this list will fail before update() is called.

Returns:

An iterable.
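
A minimal sketch of a concrete storage class wired into ProfileController (the in-memory dict and setting values are illustrative; MockProfileStorage above plays a similar role for tests):

from core.user_profile import ProfileController, ProfileStorage

class DictProfileStorage(ProfileStorage):
    """Hedged sketch: one user's profile kept in a plain dict."""

    def __init__(self):
        self.settings = {self.SYNCHRONIZE_ANNOTATIONS: False}

    @property
    def profile_document(self):
        # Assumption: the settings live under SETTINGS_KEY in the Profile document.
        return {self.SETTINGS_KEY: dict(self.settings)}

    @property
    def writable_setting_names(self):
        return [self.SYNCHRONIZE_ANNOTATIONS]

    def update(self, new_values, profile_document):
        self.settings.update(new_values)

controller = ProfileController(DictProfileStorage())
# controller.get() and controller.put(headers, body) then speak the protocol described above.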

Module contents