DMAonline Development

As an Analyst & Programmer at Lancaster University Library, I get to work on a wide range of projects, from green-field developments through to maintaining existing code bases. DMA Online is a Jisc Research Data Spring funded project to provide a dashboard that displays all relevant information related to research data management.

The following is a blog post on why we moved from using Lua for the importing and API to using Ruby and Ruby on Rails.

Originally posted on the DMA Online blog.

With more feature requests and interest from multiple institutions in DMAOnline, the decision was taken to review the design and architecture of the system and to decide whether it was scalable and easy for the development team to maintain.

After multiple discussions on the pros and cons of staying with Lua and Openresty/Nginx or moving to alternatives for the backend service, it was decided to switch to a full application framework that would support the current use cases and needs for DMAOnline while allowing easy extension and further development. Within the development team for DMAOnline, there was experience with Python and Django as well as Ruby on Rails. As more of us are developing using Ruby on Rails, we decided to move from Lua/Openresty to Ruby on Rails.

Why Ruby on Rails

With its convention-over-configuration paradigm, Ruby on Rails provides an extensible MVC framework on which to build the backend of DMAOnline. There are also numerous gems (plugins) available for Ruby on Rails that can quickly provide ready-to-use blocks of code or functionality that have been thoroughly tested. Within DMAOnline we are using the following gems to provide some of the core functionality needed:

Devise

Devise provides a prebuilt authentication framework for multiple user type authentication with all the necessary precautions around security of login and password resetting as well as handling any session management needed.

Symmetric Encryption

As DMAOnline will be interacting with many different systems, some of which may require authentication details, this gem allows us to store these authentication details securely, encrypted within our database and only decryptable when the correct encryption key is stored on disk.
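
To illustrate the idea of keeping credentials encrypted at rest, here is a minimal sketch using only Ruby's standard OpenSSL bindings. It is not the API of the symmetric-encryption gem; the `CredentialVault` class and its method names are hypothetical, and the key is assumed to have been loaded from a file on disk that is kept out of version control.

```ruby
require "openssl"

# Hypothetical helper for illustration only; not the symmetric-encryption gem's API.
class CredentialVault
  CIPHER = "aes-256-gcm"
  IV_LENGTH = 12   # bytes, the default IV size for GCM
  TAG_LENGTH = 16  # bytes, the default authentication tag size

  # key: 32 random bytes, assumed to be read from a key file on disk
  def initialize(key)
    raise ArgumentError, "key must be 32 bytes" unless key.bytesize == 32
    @key = key
  end

  # Returns a Base64 string containing IV + auth tag + ciphertext,
  # suitable for storing in a text column.
  def encrypt(plaintext)
    cipher = OpenSSL::Cipher.new(CIPHER)
    cipher.encrypt
    cipher.key = @key
    iv = cipher.random_iv
    ciphertext = cipher.update(plaintext) + cipher.final
    [iv + cipher.auth_tag + ciphertext].pack("m0")
  end

  # Reverses encrypt; raises OpenSSL::Cipher::CipherError if the key is
  # wrong or the stored value has been tampered with.
  def decrypt(encoded)
    raw = encoded.unpack1("m0")
    iv = raw[0, IV_LENGTH]
    tag = raw[IV_LENGTH, TAG_LENGTH]
    ciphertext = raw[(IV_LENGTH + TAG_LENGTH)..-1]
    cipher = OpenSSL::Cipher.new(CIPHER)
    cipher.decrypt
    cipher.key = @key
    cipher.iv = iv
    cipher.auth_tag = tag
    (cipher.update(ciphertext) + cipher.final).force_encoding("UTF-8")
  end
end

# Usage: the random key below stands in for one loaded from disk.
vault = CredentialVault.new(OpenSSL::Random.random_bytes(32))
stored = vault.encrypt("api-password")
puts vault.decrypt(stored)
```

Only the encoded value would be written to the database, so a database dump on its own is of no use without the key file.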

Papertrail

As the name might suggest, PaperTrail allows us to version key areas of the data within the application. Currently, this only extends to versioning each institution's configuration of how it uses DMAOnline. It also provides a log of what has changed with each version of the data model.
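
In a Rails model the gem is switched on by calling `has_paper_trail`. The plain-Ruby sketch below only illustrates the underlying idea of recording what changed with each version; the `VersionedConfig` class and the attribute names are hypothetical and are not taken from DMAOnline or from the gem.

```ruby
# Illustrative only: a tiny in-memory change log, not the gem's implementation.
class VersionedConfig
  # One entry per saved change: version number, {key => [old, new]}, timestamp.
  Version = Struct.new(:number, :changes, :recorded_at)

  attr_reader :attributes, :versions

  def initialize(attributes = {})
    @attributes = attributes.dup
    @versions = []
  end

  # Applies new_attributes and records a version describing the difference.
  # Returns the new Version, or nil when nothing actually changed.
  def update(new_attributes)
    changes = {}
    new_attributes.each do |key, value|
      next if @attributes[key] == value
      changes[key] = [@attributes[key], value]
    end
    return nil if changes.empty?

    @attributes = @attributes.merge(new_attributes)
    @versions << Version.new(@versions.size + 1, changes, Time.now)
    @versions.last
  end
end

# Usage with made-up configuration keys.
config = VersionedConfig.new("name" => "Lancaster", "ingest_enabled" => false)
config.update("ingest_enabled" => true)
config.versions.each { |v| puts "v#{v.number}: #{v.changes.inspect}" }
```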

While switching to Ruby on Rails gives us the ability to add functionality through gems, it also allows us to build a more extensible, object-oriented application base on which to add new functionality. This can be seen clearly in how we are developing the ingest architecture for DMA Online. We have created a generic ingester for a critical area of data, such as organisation structure, which provides standardised methods for adding to the database and for caching values, so that specific implementations don’t need to worry about how this is done.

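class OrganisationIngester
    # Parameter names are illustrative; method bodies are omitted in this outline.

    # Adds a single organisation unit to the database.
    def add_organisation_unit(org_unit)
    end

    # Records the parent/child relationship between two units.
    def link_child_to_parent(child_uuid, parent_uuid)
    end

    # Caches the mapping from a source system's UUID to our own UUID.
    def cache_uuid_mapping(source_uuid, system_uuid)
    end

    # Looks up the cached UUID for a source system's UUID.
    def get_system_uuid_mapping(source_uuid)
    end
end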

The following pseudocode shows how extending the generic ingester allows more ingesters to be written at a later date, supporting a variety of different systems.

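class JSONOrganisationIngester < OrganisationIngester

    def ingest

        # parse_file and JSON_FILE_PATH are placeholders in this pseudocode
        organisation_units = parse_file JSON_FILE_PATH

        # First pass: add every unit so that all UUIDs are known
        organisation_units.each do |org_unit|
            add_organisation_unit org_unit
        end

        # Second pass: link each unit to its parent (top-level units have none)
        organisation_units.each do |org_unit|
            next if org_unit["parent"].nil?
            link_child_to_parent org_unit["uuid"], org_unit["parent"]["uuid"]
        end

    end

end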
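
To make the pattern concrete, here is a self-contained sketch that can be run as-is. It is not the DMAOnline implementation: plain hashes stand in for the database and the cache, the ingester takes a JSON string rather than a file path, and the `name` field in the sample data is an assumption.

```ruby
require "json"
require "securerandom"

# Generic ingester: owns storage and UUID caching so subclasses only parse.
# The in-memory hashes are stand-ins for the real database and cache.
class OrganisationIngester
  attr_reader :units, :uuid_cache

  def initialize
    @units = {}       # our UUID => unit record
    @uuid_cache = {}  # source system UUID => our UUID
  end

  # Stores the unit under a freshly generated UUID and caches the mapping.
  def add_organisation_unit(org_unit)
    system_uuid = SecureRandom.uuid
    @units[system_uuid] = { name: org_unit["name"], parent: nil }
    cache_uuid_mapping(org_unit["uuid"], system_uuid)
    system_uuid
  end

  # Links two units identified by their source system UUIDs.
  # Returns false when either unit is unknown.
  def link_child_to_parent(child_uuid, parent_uuid)
    child = get_system_uuid_mapping(child_uuid)
    parent = get_system_uuid_mapping(parent_uuid)
    return false if child.nil? || parent.nil?

    @units[child][:parent] = parent
    true
  end

  def cache_uuid_mapping(source_uuid, system_uuid)
    @uuid_cache[source_uuid] = system_uuid
  end

  def get_system_uuid_mapping(source_uuid)
    @uuid_cache[source_uuid]
  end
end

# Specific ingester: knows only how to read its own input format.
class JSONOrganisationIngester < OrganisationIngester
  def ingest(json_text)
    organisation_units = JSON.parse(json_text)
    # First pass adds every unit, second pass links children to parents.
    organisation_units.each { |org_unit| add_organisation_unit(org_unit) }
    organisation_units.each do |org_unit|
      parent = org_unit["parent"]
      next if parent.nil? # top-level units have no parent
      link_child_to_parent(org_unit["uuid"], parent["uuid"])
    end
    self
  end
end

sample = <<~JSON
  [{"uuid": "a", "name": "University", "parent": null},
   {"uuid": "b", "name": "Library", "parent": {"uuid": "a"}}]
JSON
ingester = JSONOrganisationIngester.new.ingest(sample)
puts ingester.units.values.map { |unit| unit[:name] }.inspect
```

An ingester for another source system would subclass `OrganisationIngester` in the same way and supply only its own parsing step.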

Development Workflow

When changing from Lua to Ruby on Rails, we decided it would be a good time for the team to embrace a set of good development practices, including Test Driven Development (TDD), Continuous Integration and Deployment (CI/CD), and a more structured git workflow within our version control.

Within the development team at Lancaster University Library we use a range of tools/products to support our development process including:

Basecamp

Basecamp acts as our high-level project management tool. We try to keep records of all development meetings in here, from whiteboard drawings to documents containing use cases and other relevant information. Coincidentally, David Heinemeier Hansson, the CTO of Basecamp, is also the creator of Ruby on Rails, which he extracted from Basecamp and first released as open source in 2004.

YouTrack

YouTrack is a JetBrains product for issue tracking and agile project management. We use it to keep track of all the user stories we are working on, as well as feature requests and bugs within DMAOnline. This allows us to split up the development into key areas or ‘sub-systems’ and then prioritise the development of these tasks on our kanban board.

GitHub

GitHub is our hosted version control platform of choice; with its immense standing in the open source community, it makes complete sense to use it for hosting the DMAOnline code. We currently use what is referred to as GitHub Flow, a simpler alternative to Git Flow for maintaining a codebase amongst multiple developers on GitHub or any hosted version control platform. We have not yet used some of the newer features on GitHub, such as Projects (announced at GitHub Universe in September 2016), or the long-standing features of Issues and GitHub Pages. However, we are looking into how we can use them to help with the development of DMAOnline and to keep people informed of development progress.

Semaphore CI

Semaphore is a hosted Continuous Integration and Continuous Deployment platform providing a simple-to-use interface as well as a comprehensive build/test environment. Currently, we are using a standard virtual machine deployment infrastructure, and the Semaphore CI environment helps with this by providing a clean image on every build, with the ability to link in configuration files that are not stored in version control. The build environment has everything we need available, including multiple versions of Ruby, PostgreSQL (our database of choice) and Redis (used for caching), amongst other applications and programming languages.

What’s next for DMA Online Development

We are continuing the re-write from Lua to Ruby on Rails and hope to have it complete somewhere between the middle and end of October, in the first instance providing similar functionality to what is already within DMAOnline. After this, we are looking at making integrations available with multiple external systems, including Pure and DMPOnline, so that we are not just ingesting from JSON or file data.