API Banking – 10 Bank Developer Portals

In no particular order …

UK Market (Open Banking)

[1] HSBC
https://developer.hsbc.com/

[2] RBS #BankOfAPIs Developer Portal
https://developer.bluebank.io/

[3] Barclays
https://developer.barclays.com/

[4] Lloyds Bank Developer Portal
https://developer.lloydsbank.com/

[5] Halifax
https://developer.halifax.co.uk/opendata-v2

European Market (PSD2)

[6] Nordea Developer Portal
https://developer.nordeaopenbanking.com/app/docs

[7] ING Developer Portal
https://developer.ing.com/openbanking/

[8] Deutsche Bank Developer Portal
https://developer.db.com/#/

US Market

[9] Wells Fargo Developer Portal
https://developer.wellsfargo.com/

[10] Citi Developer Hub
https://sandbox.developerhub.citi.com/


REST API Design – A Beginner’s Reading List

There’s no better place to start than Steve Yegge’s post, where he discusses the Jeff Bezos memo that kicked off the service architecture revolution at Amazon:

The RESTful Cookbook is your next stop – an easy-to-digest treatment of many of the key topics:
http://restcookbook.com/

Then you should read this classic worked example – ‘How to GET a Cup of Coffee’:
https://www.infoq.com/articles/webber-rest-workflow

(If you read the comments section of the article, you’ll get a useful taste of the kind of design discussions that come up in the field)

The REST API Design Handbook is a good, quick read, once you’re starting to get more confidence:

It’s a little dated now, but the RESTful Web Services Cookbook is a great resource once you start needing deeper coverage on topics such as asynchronous calls and versioning:

Lastly, if you want to see a standard for REST API design published by an organisation that’s got some serious experience, check out this public document from Zalando:
https://opensource.zalando.com/restful-api-guidelines/

It is long, but you’ll find yourself agreeing with most of it.


Bluff Your Way in Enterprise Architecture

Being an architect is hard work. Given how small the population of people willing to do hard work is, it might be a little mysterious to you how many people manage to wangle ‘architect’ into their job titles*, yet how few know what they are talking about.

If so, help is at hand – use this handy list of bluffing points to enable you or your team (or even that guy from Pret who brought the sandwiches in today) to play the role of enterprise architect while HR complete the twelve month recruitment process.

Bluff 1: I’m just not sure this solution will scale

Unanswerable, since any attempt to defend the solution can simply be rebutted by you declaring that wasn’t what *you* meant by scale. See below for a worked example:

Victim: “So our document storage system can store up to 2 petabytes of data on a single node”

You: “I’m just not sure this solution will scale”

Victim: “Well we can easily scale out to 64 nodes without needing any additional configuration”

You: “I’m sure the trivial case is fine, but I was more concerned about real-world scenarios; we are looking at a zettabyte load”

Bluff 2: What’s your use case?

 Ostensibly an appeal to practicality, this is actually a brilliant way to patronise someone by insinuating they don’t talk to their customers and are obsessed with technology for technology’s sake. The use of Agile lingo makes it particularly hard to counter, since no-one wants to argue against Agile, right?

On very rare occasions you will meet someone who is actually half competent at Agile, in which case your fallback position is to bamboozle them with semantics, as follows:

You: “But I’m just not clear what our use case is here.”

Victim: “Well, as a customer of Big Corp, I want to log into my mobile app, so that I can check my orders”

You: “Hmmm. But are we sure that’s really a use case? To me that sounds more like an epic/spike/…/etc”

Bluff 3: What does the Technical Traffic Warden Committee say about this?

Every large organisation attracts a modest but determined band of people who love to proceduralise and standardise the joy out of anything they can get their hands on, including the very thing that got most of us into computing in the first place, which is playing with cool new stuff.

The TTWC won’t actually be named as such, but their ostensible function (preventing technical sprawl) and their actual effect (keeping their organisation a decade behind the rest of the industry) will feel very familiar to anyone who has gotten a parking ticket.

Dobbing your victim into the TTWC will embroil them in months of red tape while they try to explain to a bunch of fifty-somethings what a graph database is. This will definitely make you enemies, so use with caution.

Bluff 4: Sorry, I’ve got to jump into another meeting now …

Never, ever, under any circumstances, decline a meeting. If you play this one right, you should have at least two different meetings happening in your calendar at any point in the working week. To get the full effect, you should share your calendar, so anyone who actually gets a bit of your time is fully aware how lucky they are, and how you might have to dash off at a moment’s notice (particularly if you’ve got something more interesting to do, or if people start talking about deliverables). Also, if you fail to turn up to one meeting, the attendees will naturally assume you are at the other meeting. Not one for the beginner, but it can be devastatingly effective.

Bluff 5: Are we sure we’ve correctly separated the [control/management/data] plane from the [management/data/control] plane here? (Delete as applicable)

This is a ninja move guaranteed to bamboozle most people in the room in any meeting, sending them into a mental tailspin of anxiety as they try to figure out what you are talking about.

In computing this gets fantastically complicated, as people do crazy stuff like insist on separate physical networks for these things, and then wonder why they can’t afford a test environment or release more than once a year. Great for you, as you can simply make things more complicated if anyone looks like they’re starting to understand:

You: “I’m just not sure we’ve correctly separated the management plane from the data plane here.”

Victim: “Well the admin process runs on a separate port secured by TLS”

You: “Hmmm. I don’t suppose you support a different LDAP domain for admin users, do you? That’s the Big Corp standard.”

Bluff 6: I think we need to run this one past Security

Not quite as bad as throwing someone to the TTWC (see above), but this will have a similar effect, as the victim attempts to work out whether they should be talking to the security architecture team or the threat prevention team. All of these teams will be small and incredibly overworked, to the point that the only way of speaking with them will be to actually accost them as they run from the smoking remains of one emergency to the next.

Security teams are usually masters at making sure the responsibility and effort for securing systems remains with the people building or buying them, so this is a good way of giving a project a mountain of invisible work that they are pretty much obliged to do if they want to get into production.

Bluff 7: I think there’s quite a lot of overlap here with the XYZ initiative. I think we need to set up a meeting with them before we go any further with this.

Any large organisation will have multiple projects on the go, all competing for the same limited nutrients of money and management attention. There will definitely be something else out there which resembles or overlaps in some superficially plausible way with the project you’re looking at, so why not “help” by getting them together in the same room to pretend to be interested in each other? You can even rerun Game of Thrones for your private entertainment by secretly christening one project the Lannisters, and the other project the Starks.

Bluff 8: Why don’t we just try this out as an experiment before we commit?

The idea here is to spend a few weeks/months kicking the tyres before you commit millions, so you can walk away if it doesn’t work out.

Sounds reasonable, doesn’t it?

Despite sounding reasonable, it turns out most organisations are about as good at experimenting with projects as you and I would be at experimenting with crack. Experiments turn into POCs, which turn into pilots, which turn into Phase 1, which finally turns into Security being asked to sign off for go-live on a solution they’ve never heard of.

So, yeah, you’re really only saying this to be the one who said it, with a side order of plausible deniability if it all goes Pete Tong.

Bluff 9: The underlying, guaranteed solution to all technology problems

I’ve used this one to justify multiple board positions for two decades; it absolutely cannot fail. All you need to do is – ah, shoot, I’ve got to go to a meeting with the CCB about a firewall change, I’ll drop it to you in an email, I promise …

*  I’m talking to you, ‘DevOps Architect’, seriously?


Infrastructure as Code – Key Terms

There’s an excellent introductory series on Terraform over at Gruntwork, and apart from anything else it gives a very clear explanation of what the different tools in this space do.

I recommend the blog, but here’s a quick summary of some key terms:

  • Provisioning (orchestration): provisions the servers and talks to the datacentre (e.g. vSphere, OpenStack). Tools: Terraform, CloudFormation.
  • Configuring (configuration management): installs and manages software on existing servers and talks to the server itself (e.g. Linux, Windows). Tools: Chef, Puppet, Ansible.

How To: Anonymise swap trade data for your project team

I was recently asked for some "production-like" OTC swaps coming from Calypso so that a development partner could test their proof-of-concept project. I needed to provide trade data as well as referential data for product and account lookups to support the testing. The following shows some of the techniques I employed to anonymise the swap data whilst still enabling the vendor to use it to prove their system.

Anonymise Swap

Technique: HASHING
Purpose: To change the value of data items like account IDs
Effect: Breaks the link between the data to be released and the original production data
Applied To: Account, Product, Index and Trade IDs
Comments:
We developed an algorithm that could change IDs in the data. We needed to maintain the integrity of lookups from the trade to the referential data. Therefore, our algorithm scrambled the data in the same way for both the trade and referential datasets. This scrambling removed the ability to link the data back to the production system.
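To make the idea concrete, here is a minimal sketch (not our production algorithm): a keyed HMAC applied with the same secret to every ID, so the trade file and the referential file still join on the scrambled values while the originals cannot be recovered without the key. The class and method names are invented for the example.

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;
    import java.nio.charset.StandardCharsets;
    import java.util.Base64;

    public class IdScrambler {

        private final Mac mac;

        public IdScrambler(byte[] secretKey) throws Exception {
            // The same key must be used when scrambling the trade data and the
            // referential data, otherwise the account/product lookups stop joining.
            mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secretKey, "HmacSHA256"));
        }

        /** Replaces a production ID with a stable but irreversible surrogate. */
        public synchronized String scramble(String originalId) {
            byte[] digest = mac.doFinal(originalId.getBytes(StandardCharsets.UTF_8));
            // Truncated for readability; keep more of the digest if collisions are a concern.
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest).substring(0, 16);
        }

        public static void main(String[] args) throws Exception {
            IdScrambler scrambler =
                    new IdScrambler("keep-this-key-off-the-test-estate".getBytes(StandardCharsets.UTF_8));
            // The same input always yields the same surrogate, wherever it appears.
            System.out.println(scrambler.scramble("ACC-000123"));
        }
    }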

Technique: DATE SLIDING
Purpose: To slide all the dates in the trade data forward/backward by a consistent value
Effect: Changes the dates on the trade whilst still maintaining the integrity of the dates
Applied To: Trade, As Of, Execution, Cleared, Effective, Termination and Payment dates
Comments:
We developed an algorithm based on a couple of trade attributes. The first was used to determine the offset value to be applied to the dates in the trade data. The second was used to determine the direction (forward or backward). This was particularly effective as it applied a different slide to each trade.
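A rough sketch of the approach follows; the attribute choices, the 30-day cap and the hash-based derivation are assumptions for illustration rather than the values we actually used.

    import java.time.LocalDate;

    public class DateSlider {

        /**
         * One trade attribute drives the size of the slide, another drives the
         * direction, so every trade gets its own consistent offset.
         */
        public static LocalDate slide(LocalDate date, String offsetAttribute, String directionAttribute) {
            int offsetDays = 1 + Math.floorMod(offsetAttribute.hashCode(), 30);            // 1..30 days
            int direction = Math.floorMod(directionAttribute.hashCode(), 2) == 0 ? 1 : -1; // forward or back
            return date.plusDays((long) direction * offsetDays);
        }

        public static void main(String[] args) {
            // All dates on the same trade slide by the same amount in the same direction,
            // so the gaps between trade, effective and payment dates are preserved.
            LocalDate tradeDate = LocalDate.of(2016, 3, 14);
            LocalDate effectiveDate = LocalDate.of(2016, 3, 16);
            System.out.println(slide(tradeDate, "TRADE-001", "CP-42"));
            System.out.println(slide(effectiveDate, "TRADE-001", "CP-42"));
        }
    }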

Technique: PERTURBATION
Purpose: To adjust the economic values in a dataset so they no longer match the original
Effect: Changes the economic values by applying aggregates across various ranges
Applied To: Notional, Fixed Rate, Premium
Comments:
We analysed the economic data in the dataset and applied averages across bands of data. This means that the dataset as a whole is still mathematically intact but individual economic values on trades have been adjusted.
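As a simplified illustration (the band width, rounding and use of plain notional lists are assumptions for the example), a band-average approach replaces each notional with the mean of its band, so aggregate totals stay close to the original while no individual trade keeps its true value:

    import java.math.BigDecimal;
    import java.math.RoundingMode;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class NotionalPerturber {

        // A band width of 1,000,000 is an assumption for the example.
        private static BigDecimal band(BigDecimal notional) {
            return notional.divide(BigDecimal.valueOf(1_000_000), 0, RoundingMode.FLOOR);
        }

        /** Replaces every notional with the average of the band it falls into. */
        public static List<BigDecimal> perturb(List<BigDecimal> notionals) {
            Map<BigDecimal, List<BigDecimal>> bands =
                    notionals.stream().collect(Collectors.groupingBy(NotionalPerturber::band));
            Map<BigDecimal, BigDecimal> bandAverages = bands.entrySet().stream()
                    .collect(Collectors.toMap(Map.Entry::getKey, e -> average(e.getValue())));
            return notionals.stream()
                    .map(n -> bandAverages.get(band(n)))
                    .collect(Collectors.toList());
        }

        private static BigDecimal average(List<BigDecimal> values) {
            BigDecimal sum = values.stream().reduce(BigDecimal.ZERO, BigDecimal::add);
            return sum.divide(BigDecimal.valueOf(values.size()), 2, RoundingMode.HALF_UP);
        }
    }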

Technique: MASKING
Purpose: To prevent text within the data from providing information to the consumer
Effect: Replaces text strings with “*” characters
Applied To: Party names, country of residence, contact information, trader names
Comments:
Simple masking was implemented on this data. As an additional security step, we ensured that all the masks were the same length. This prevents anyone from deducing client names from the length of the mask.
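The important detail is the fixed width, as this tiny sketch shows: whatever goes in, the same ten-character mask comes out, so string length leaks nothing about the original value.

    public class TextMasker {

        // Every masked value is the same length, so nobody can work back from the
        // length of the mask to the length of the original party name.
        private static final String FIXED_MASK = "**********";

        public static String mask(String sensitiveText) {
            return FIXED_MASK;
        }

        public static void main(String[] args) {
            System.out.println(mask("Acme Global Markets LLC")); // **********
            System.out.println(mask("Z Bank"));                  // **********
        }
    }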

Finally, we applied one further technique, adding "noise" to the data in the form of additional entries in the dataset.

Technique: K-Anonymisation
Purpose: To distort the number of entries in the dataset for a given set of criteria
Effect: Ensures that there will always be at least “K” occurrences of trades matching the criteria
Applied To: Additional trades across the dataset
Comments:
We were concerned that it might be possible to narrow down trades for a specific counterparty. If the consumer of the data knew that a single trade had taken place with that counterparty for a specific value, it could be possible to identify the trade. In order to obfuscate the dataset, we developed an algorithm that ensures there will always be at least "K" entries for the specified criteria.
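A simplified sketch of the padding idea (the grouping key, the Trade shape and the way the synthetic notional is nudged are all assumptions made for the example):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    public class KAnonymiser {

        /** Minimal trade shape for the sketch; field names are illustrative. */
        static class Trade {
            final String counterparty;
            final String product;
            final double notional;

            Trade(String counterparty, String product, double notional) {
                this.counterparty = counterparty;
                this.product = product;
                this.notional = notional;
            }
        }

        /**
         * Pads the dataset so that every (counterparty, product) group contains at
         * least k trades, making a single known trade much harder to single out.
         */
        public static List<Trade> pad(List<Trade> trades, int k) {
            List<Trade> padded = new ArrayList<>(trades);
            Map<String, List<Trade>> groups = trades.stream()
                    .collect(Collectors.groupingBy(t -> t.counterparty + "|" + t.product));
            for (List<Trade> group : groups.values()) {
                Trade template = group.get(0);
                for (int i = group.size(); i < k; i++) {
                    // Synthetic "noise" trade cloned from the group, with a nudged notional.
                    padded.add(new Trade(template.counterparty, template.product,
                            template.notional * (1.0 + 0.01 * i)));
                }
            }
            return padded;
        }
    }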

Postscript
I’ve created a spreadsheet demonstrating some of these techniques, which you can download via the form below.


Automated FpML Message Testing with JMeter


One of the ingredients of a successful messaging project is strong testing. However, the fluid nature of messaging projects means iteration after iteration of system releases, which presents a challenge for the testers, who need to run the tests and verify the results over and over again. Given the complex routing, functional and regression testing requirements in messaging projects, you will need an automated process. Without it you will struggle to prove that your release is fit for purpose in a timely manner. We have found that the Apache Foundation’s JMeter provides a perfect solution.

The Apache Foundation’s JMeter solution provides a way to automate testing as well as check the results. Although designed to flex systems to provide load testing and monitoring services, the software can also orchestrate tests – which is perfect for the testing of messaging systems. Additionally, JMeter doesn’t need a full Developer software setup. It doesn’t require an install – simply dropping the JMeter files on your machine is enough to get it up and running.


The following article details how we used JMeter to orchestrate the testing of a messaging system.


Before we started

Before we rushed into building out tests for the messaging system, we needed to think a few things through:

  • Strategy: What would prove that the system worked?
  • Test Pack: What would our test inputs look like?
  • Orchestration: How would we present the test inputs and check the outputs?
  • Visibility: How would we know which were our tests in a busy environment?
  • Control: How could we maintain strict version control of our tests?

Strategy

We designed our tests using the black box testing strategy. This means ignoring the inner workings of the messaging system and looking at the inputs and outputs from it. In our messaging system, we concentrated on a single target system. There are numerous other targets that are fed by our messaging system but we chose to build our test pack around this particular system.


Fig 1.1 – Black Box testing strategy

[A point of note. JMeter is sufficiently flexible to support us moving to white box testing in later iterations.]


Test Pack

The test data for our system would consist of FpML messages. We won’t cover how we determined the content of these messages here; however, it’s important to understand how we stored them. We decided to use individual CSV files to contain the messages for each functional test that we required. This resulted in approximately ten CSV files, each holding numerous FpML messages. We stored these in our version control system.


Orchestration

This is where JMeter came into its own. We made use of the following functionality within the tool in order to support our testing.

  • HTTP Header Manager: This allowed us to connect to the input queue via the RabbitMQ HTTP web service
  • JDBC Connection: This allowed us to connect to the target Oracle database
  • CSV Data Set Config: This allowed us to read in our CSV test packs and load the messages
  • Constant Timer: This allowed us to build in a delay between posting and checking the results
  • BeanShell Sampler: This allowed us to get creative with generating IDs and counting the rows on our CSV test packs
  • Loop Controller: This allowed us to loop through our message push for each row on our CSV test packs
  • JDBC Request: This allowed us to run SQL queries against the target database to pull our results back
  • Response Assertion: This allowed us to compare the results returned to our expected results
  • View Results Tree: This allowed us to see the results of tests

That’s quite a lot of functionality that we could call on out-of-the-box. JMeter allowed us to use all of these and string them together in order to meet our requirements. They are all added to the test plan’s tree structure and configured via the UI. Our Business Analyst was able to build all this without a developer-spec machine.


Visibility

Our test environment had a lot of activity taking place within it. To ensure that we could see our tests, we decided to generate a "run number" for each test run and prefix all our trade IDs with that number. We could then quickly spot our trades, and it also allowed us to pull the results for that run only from the target database.

JMeter provides built-in User Defined Variables functionality, which allowed us to generate this run number automatically and set a runtime variable to hold the value. It was then straightforward to adjust our test packs to include this variable.
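For example (a sketch, with variable names of our own choosing rather than the exact ones in the real test plan), a BeanShell Sampler at the top of the plan can generate the run number and publish it as a JMeter variable that the rest of the plan references as ${runNumber}:

    // BeanShell Sampler, executed once at the start of the test plan.
    // 'vars' and 'log' are objects JMeter makes available to BeanShell scripts.
    String runNumber = String.valueOf(System.currentTimeMillis() / 1000);
    vars.put("runNumber", runNumber);
    log.info("Test run number for this execution: " + runNumber);

    // Trade IDs in the CSV test packs can then be templated as ${runNumber}-${tradeId},
    // which is what made our trades easy to pick out in a busy shared environment.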


Control

The outstanding feature of JMeter is that it can easily pull in version controlled files. This ensured that our test packs could be checked into version control and become part of our project artifacts. The JMeter test plan itself can also be saved as a .jmx file and stored in version control. This is a critical feature when working in such fluid development projects.


When you put it all together, what does it look like?


Fig 1.2 – Our  JMeter Testing Framework

Summary

JMeter allowed us to quickly build out an automated testing function for our BAs to use. We were able to save the orchestration as well as our test data in our version control system. Moving from a slow manual process utilising multiple tools to an automated, self-contained and self-checking testing tool was critical to the project’s success. It is also possible to add JMeter to your Jenkins automated build so these tests can be run with every build in the future.


If you want to know more about how we did this and what we could do for you and your projects, then feel free to get in touch.



MariaDB CONNECT – Avoiding the pitfalls!


There will come a time when you need to make data available to your MariaDB application from other database management systems. The CONNECT storage engine allows you to do this. This article covers how to use it to access remote data, along with some of the challenges and pitfalls you may encounter.

In one of our recent projects, we needed to calculate some count statistics from two Oracle 11g database tables and store the results in our mariaDB 10.0.22 database. We were dealing with approximately 2 million rows on each of the Oracle tables and, as we were calculating set theory counts, we needed to compare the keys on both tables. The tables were indexed correctly and performance within Oracle was really good.

In order to access the Oracle tables, we needed to set up CONNECT. Having rushed through the CONNECT documentation, we set up two CONNECT tables in our MariaDB database, one for each of the remote Oracle tables.

The MariaDB CREATE TABLE statements looked a bit like this:

CREATE TABLE CONNECT_Remote_Data_TableA
ENGINE=CONNECT
TABLE_TYPE=ODBC
TABNAME=TableA
CONNECTION='Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;';

CREATE TABLE CONNECT_Remote_Data_TableB
ENGINE=CONNECT
TABLE_TYPE=ODBC
TABNAME=TableB
CONNECTION='Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;';

When we ran these, the result was successful and the two tables were created. A quick test via “Select * from CONNECT_Remote_Data_TableA” proved that data was indeed flowing from Oracle to mariaDB.

We built our queries in mariaDB, referring to the CONNECT tables and started our unit testing. The results were good and we could insert the data returned from them into a mariaDB table. CONNECT was a success and we could now push on with the rest of the development, having built and tested this functionality.

Everything went well until we started to ramp up the volume in the Oracle tables. Then we witnessed an alarming degradation in performance that got worse as we added more and more data. At first we struggled to understand what the problem was – the tables were indexed, after all, so access should have been really quick. It was only when we started to think through what a CONNECT table actually is, and did some more reading, that we found the problem: it came down to where the SQL query was actually being executed.

Here is a representation of what we had built:

[Diagram: the SQL query running in MariaDB, pulling unindexed rows from the Oracle tables over the CONNECT/ODBC link]

In this configuration, our SQL query was running in MariaDB and drawing data from the Oracle tables. MariaDB inserted the result into the results table, but it was very slow. Out of interest, we took the SQL query, converted it to Oracle PL/SQL and ran it in Oracle. The results were lightning quick, as you’d expect given that the tables were correctly indexed. So the problem was related to where the SQL ran:

  • In mariaDB – very slow
  • In Oracle – very fast

What’s the usual solution to make a slow query run quickly? Indexing. So we looked at that. In our rush to get this up and running, we had missed the fact that ODBC CONNECT tables cannot be indexed. In effect, all we had created was a conduit or “pipe” to the data which arrived in a stream of unindexed rows that mariaDB then had to work heroically to produce our results from.

So how could we make use of the Oracle indexing within our query and still get the results into MariaDB? It seemed that we needed to "push down" the SQL query to the Oracle end of the CONNECT "pipe". To do this, we realised that we only needed a single MariaDB CONNECT table, but that table would need the SRCDEF parameter added to it. SRCDEF allows you to execute SQL on the remote database system instead of in MariaDB. The SRCDEF needed to contain a PL/SQL query, as it would be running natively in Oracle. Our new CONNECT statement looked like this:

CREATE TABLE CONNECT_Remote_Data_Count
ENGINE=CONNECT
TABLE_TYPE=ODBC
TABNAME=TableA
CONNECTION='Driver={Oracle ODBC driver};Server=://xxx.xxx.xxx.xxx:1521/ORCL;UID=USERID;PWD=PASSWORD;'
SRCDEF='…PL/SQL equivalent of pseudo SQL: count the entries on TableA that are also on TableB…';

However, when we executed a "Select count(*) from CONNECT_Remote_Data_Count" we received a strange result: 1. The answer was returned very quickly, which was encouraging, but we knew that this wasn’t the correct answer – we expected many thousands of entries to be on both tables. After a little more head scratching, we tried "Select * from CONNECT_Remote_Data_Count" and voilà – our expected result was returned. In effect, we were selecting the content of the CONNECT table’s query.

So we now had an Oracle PL/SQL query that was wrapped inside a mariaDB CONNECT “pipe” and being executed remotely in an Oracle database where it could make full use of the indexing. The result was then the only data item being sent down the “pipe” from Oracle to mariaDB.

The final solution looked like this:

[Diagram: the SRCDEF query executing inside Oracle, with only the aggregated result travelling back to MariaDB over the CONNECT "pipe"]

So, as we can see, CONNECT is a powerful thing. It allowed us to build a solution that populated our MariaDB system with the results of a query against two tables sitting on an Oracle database. The full power of the Oracle indexing was utilised and the results were returned very quickly.

If you’d like to know more about how we are using CONNECT, then just get in touch.

 


Continuous Lifecycle London 2016 – Conference Notes

Who was there

Big names:  Jez Humble and Dave Farley (authors of Continuous Delivery), and Katherine Daniels (Etsy).

Reportedly there were 270 delegates (it certainly felt like it).

Vendors

In general, thin on the ground – New Relic, HPE, JetBrains, Automic, Chef, Serena, CloudBees and Perforce.

We didn’t see Red Hat or Puppet, and there were no service companies with a stand, although plenty of consultancies on the speaker list.

We asked some of the vendors about Docker support (which ended up feeling a bit like pulling garlic on vampires) and responses varied from “we don’t really have anything there” to “we’ve got something coming soon.”

Favourite Moments / Thoughts / Quotes

@drsnooks: microservices (n,pl); an efficient device for transforming business problems into distributed transaction problems.

“The Chaos Snail – smaller than the Chaos Monkey, and runs on shell.”

“With the earlier revolution (virtualisation), every tool that runs on bare metal also runs on a VM. With the container revolution, this is not true.” (Dima Stopel from Twistlock)

“Tools will not fix a broken culture.”

Katherine Daniels ending her talk with a passionate speech on the need for more diversity in the IT industry.

“Continuous Delivery != Continuous Deployment.” (Jez Humble and Dave Farley repeatedly)

Puppet Should Charge Per-Stream Royalties for Their Report

Memo To All Consultants: It Is Now Time To Stop Quoting The 2014/5 Puppet Labs State of DevOps Report.

Jez Humble probably got away with it by speaking first 🙂

Platforms In The Wild

An entirely unscientific sample of platforms that people are using in the wild for continuous delivery, microservices and DevOps:

  • Financial Times: 45 microservices running on Docker + CoreOS + AWS. Were using private cloud but now use AWS. Live.
  • Pearson Technologies: Docker + Kubernetes + OpenStack. Two AWS availability zones and one private cloud. Not yet live.
  • Home Office: Docker + Kubernetes.
  • <private chat>: NServiceBus, Azure, Rabbit MQ.
  • Government Digital Service (gov.uk): open source Cloud Foundry + vCentre. Preparing for a move to OpenStack on AWS.
  • Azure Container Service: reputed to be using Mesos …

Personal Opinion:

For infrastructure-as-a-service, AWS is starting to sound like the choice for the early majority as well as the early adopters. Organisations with sensitive information requirements are already positioning themselves for the arrival of UK data centres. Relatively little mention of Cloud Foundry or Heroku – Docker is the topic du jour.

The objection to ‘rolling your own container platform’ is the amount of work you have to do around orchestration, logging, monitoring, management and so on. This didn’t seem to be putting people off, nor were we seeing much mention of frameworks such as Rancher.

Further Reading

Empathy: The Essence of DevOps – Jeff Sussna
http://blog.ingineering.it/post/72964480807/empathy-the-essence-of-devops

Why Every Company Needs Continuous Delivery – Sarah Goff-Dupont
http://blogs.atlassian.com/2015/10/why-continuous-delivery-for-every-development-team/


Continuous Integration with Docker and Jenkins – Not So Easy

TL;DR: It takes a few minutes to pull a Jenkins container, it takes a few weeks of hard work to get it playing nicely with Docker.

Intro

We wanted to build a CI pipeline to do automated deployment and testing against our containerised web application. And we picked the most mainstream, vanilla technology set we could:

[Image: our technology stack – GitHub, Docker Hub, Jenkins and Docker]

Our Reasoning

[1] The link between hosted GitHub repositories and hosted Docker Hub builds is lovely.

[2] Triggering Jenkins jobs from Docker Hub web hooks *must* be just as lovely.

[3] There *must* just be a Jenkins plugin to stand up Docker applications.

Reality Bites #1 – Docker Hub Web Hooks

These aren’t reliable. Sometimes Docker Hub builds time out if the queue is busy, so the web hook never fires. But the upstream change has still happened in GitHub, and you still need your CI pipeline to run.

Our Solution

We changed our Jenkins job to be triggered by GitHub web hooks. Once triggered, our job called a script that polled Docker Hub until it spotted a change in the image identifier.
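The script itself was plain shell, but the idea is simple enough to sketch in Java; the repository URL below is a placeholder and the change detection is deliberately crude – compare the tag metadata (which includes the image digest) against the last copy we saw:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class DockerHubPoller {

        // Placeholder coordinates -- substitute your own namespace, repository and tag.
        private static final String TAG_URL =
                "https://hub.docker.com/v2/repositories/myorg/myapp/tags/latest/";

        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create(TAG_URL)).GET().build();
            String lastSeen = client.send(request, HttpResponse.BodyHandlers.ofString()).body();

            while (true) {
                Thread.sleep(60_000); // poll once a minute
                String latest = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
                if (!latest.equals(lastSeen)) {
                    // The image metadata (including its digest) has changed -- trigger the
                    // downstream deployment and test jobs here.
                    System.out.println("New image detected, kicking off the pipeline");
                    lastSeen = latest;
                }
            }
        }
    }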

Reality Bites #2 – So there is a Jenkins Plugin …

… but it doesn’t work, and is now dormant. The main issue is that authentication no longer works since the Docker API 2.0 release but there is a reasonable list of other issues.

Our First Solution

We looked at Docker in Docker https://blog.docker.com/2013/09/docker-can-now-run-within-docker/ and Docker outside Docker https://forums.docker.com/t/using-docker-in-a-dockerized-jenkins-container/322. The latter had some success and we were able to execute Docker commands, though this wasn’t scalable as you are limited to a single Docker engine – which may or may not be an issue, depending on the scale of your setup.

Our Second Solution

We set up a Jenkins master/slave configuration. The master is a Dockerised Jenkins image (it doesn’t need to have access to Docker in this configuration). The slave is another Linux instance (in this case on AWS). Our instance is fairly lightweight – a standard t2.micro (which is free-tier eligible) AWS Linux instance, on which we installed SSH, Java, Maven and Docker.

A user is created that has permission to run Docker and access to a user-created folder, /var/lib/Jenkins. The Jenkins master can then run the slave via SSH, and we can confine Jenkins jobs to run only on that slave and run shell scripts such as docker pull. This is fully extensible and allows for parallel job execution and segregation of Jenkins job types, e.g. compile on one slave, Docker on another, and so on.

Reality Bites #3 – I’m sorry, can you just explain that last bit again?

The Jenkins Docker image is tempting as an easy way to get Jenkins, but it creates a new problem which is “controlling a target Docker Engine from inside a Docker container controlled by another Docker Engine.”

If you create a Jenkins “slave” on a separate host, your Jenkins Docker container can happily send commands to that slave via SSH. Your “slave” is just a VM running Jenkins alongside Docker Engine, so you can run shell scripts locally on the “slave” to call docker compose.

Summary

The hard bit of this is getting from a nice diagram (http://messageconsulting.com/wp-content/uploads/2016/03/ContinuousBuildAndIntegration02.png) to a set of running applications deployed either as containers or native processes on a set of hosts that are all talking to each other, and to your upstream source code repository. Plenty of people have done it already, but be prepared to do some head scratching, and write some bash scripts!


Three Amigos, One Cucumber and Where to Stick it in Jenkins

This article is aimed at the stalwart of software development, the Test Manager! The scenario: your boss has been on a jolly and has heard the term Cucumber whilst enjoying the free bar! Apparently, using a cucumber has helped his friend/rival steal a march on the market and it’s your job to work out how you can repeat this success using a long, green-skinned fruit! The aim here is to give the Test Manager enough information to understand what on earth the Three Amigos would do with a cucumber in an automated way. Don’t worry, we promise to avoid all food analogies or allergies, sorry!

Ok, let’s think of some common things we do during our Test Planning and Automation Phases, what usually goes wrong and how we can fix it.

The first thing we try to do is understand the Application Under Test (this is true whether you are working in Agile, Waterfall or whatever). This typically involves, amongst other things, a workshop with the Business Analyst, the Development Team and the Testers. I make that a count of three, aha! The Three Amigos! Of course, this meeting can involve a whole host of others, though the point is there are three groups of people present – or, in jargon, three domains. These groups are trying to come to a shared understanding of the requirements, which typically results in three sets of documentation, each with its own vocabulary. The long-running specification workshop eventually wraps up, with each group relatively content that they know what they are doing and can carry out their respective tasks. The Test Manager and team set about their business, only to discover some way into the process that there have been several misunderstandings and the tests don’t validate the customer requirements – and even though it’s a no-blame culture, people want to know whose fault it is that it’s not working. Sound familiar? Wouldn’t it be nice if there was a tool that could effectively close this gap in understanding, using a shared language, and at the same time give us a flying start in the automation process? Well, brace yourself for Cucumber!

I made a common mistake when first looking into Cucumber – I took it to be a pure test automation tool. I missed the point: it really is a superb way of collaborating. The Three Amigos (yes, it was taken from the film) work together in short meetings (hopefully no more than an hour) on a regular basis to arrive at a shared understanding of how the software should behave, and capture this behaviour in scenarios. Well, that’s nothing new, you say! The clever bit is the way that the scenario is captured; Cucumber makes use of Feature files. These files are in plain English and have a very lightweight structure. For example, at Redhound we have developed a product called Rover Test Intelligence, and below is the actual feature file we use to test a particular scenario. Without any other form of documentation, can you tell what the product and the test do?

Feature: Rover Categorise Data
  As a Release Manager I want to be able to categorise unexpected
  differences in data between Production and UAT whilst ignoring
  irrelevant fields

  Scenario: User categorises data
    Given That I am logged in to the Rover Categorisation screen
    When I select a difference group
    And I launch the categorise process
    Then I can tag the record with a difference label

Try another

Feature: Rover See Data Differences
  As a Release Manager I want to be able to see differences in data
  between Production and UAT whilst ignoring irrelevant fields

  Scenario: User views differences
    Given That I am logged in to the Rover Categorisation screen
    When I select a difference group
    And I launch the see differences process
    Then I can view the differences in data between the two record sets

As you can see, this is hopefully understandable to most people reading it. And, this is important, the steps (Given, When, And, Then) can be interpreted by a computer – so that is Three Amigos, a Cucumber and a laptop! It may not be obvious, but this feature file is written in a language called Gherkin. The Feature files can be developed in their totality outside the Three Amigos specification meeting, so long as a feedback loop is in place to ensure the Amigos stay friendly.

When I say it can be interpreted by a computer, there is work to do here. At this point the Test Engineers get busy: when you run Cucumber, it takes the Feature file steps and creates a skeleton framework, and the Test Engineers then have to put the meat on the bones – no, Cucumber will not write your automation tests, you still have to code them.

At Redhound we are using IntelliJ IDEA as the Integrated Development Environment, Maven for dependency management and Java as the language of choice. With this setup, when you run a Feature file (rover_see_data_differences.feature) for the first time, Cucumber will helpfully generate the following:

//You can implement missing steps with the snippets below:
@Given("^That I am logged in to the Rover Categorisation screen$")
public void That_I_am_logged_in_to_the_Rover_Categorisation_screen() throws Throwable {
    // Express the Regexp above with the code you wish you had
    throw new PendingException();
}

@When("^I select a difference group$")
public void I_select_a_difference_group() throws Throwable {
    // Express the Regexp above with the code you wish you had
    throw new PendingException();
}

@When("^I launch the see differences process$")
public void I_launch_the_see_differences_process() throws Throwable {
    // Express the Regexp above with the code you wish you had
    throw new PendingException();
}

@Then("^I can view the differences in data between the two record sets$")
public void I_can_view_the_differences_in_data_between_the_two_record_sets() throws Throwable {
    // Express the Regexp above with the code you wish you had
    throw new PendingException();
}

Granted, the above does look a little more technical, though you can see that the steps from the Feature file are now linked via regular expressions to Java code – brilliant! The generated code snippets can be cut and pasted directly into a Java class, and the process of developing your automated tests (and indeed your software) can begin. Your own test code is placed in the auto-generated method bodies, replacing the “throw new PendingException();” statement.
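To make that concrete, here is a hedged example of what one filled-in step might look like using Selenium WebDriver; the URL, element IDs and credentials are invented for the illustration and this is not the real Rover test code:

    import cucumber.api.java.en.Given;   // Cucumber-JVM annotation (package name varies between Cucumber versions)
    import org.openqa.selenium.By;
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.chrome.ChromeDriver;
    import static org.junit.Assert.assertTrue;

    public class RoverLoginSteps {

        private final WebDriver driver = new ChromeDriver();

        @Given("^That I am logged in to the Rover Categorisation screen$")
        public void That_I_am_logged_in_to_the_Rover_Categorisation_screen() throws Throwable {
            // Illustrative URL and element IDs only.
            driver.get("https://rover.example.com/login");
            driver.findElement(By.id("username")).sendKeys("test.user");
            driver.findElement(By.id("password")).sendKeys("not-a-real-password");
            driver.findElement(By.id("loginButton")).click();
            // Fail the step if the categorisation screen has not loaded.
            assertTrue(driver.getTitle().contains("Categorisation"));
        }
    }

The regular expression in the annotation is exactly the one Cucumber generated, so the plain-English step in the feature file is now wired to runnable browser automation.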

The real advantage here is that there is a shared understanding of what the feature steps mean, a so-called ubiquitous language; the developers can make it, the testers can break it, and the Business Analyst can see that the product is being developed in line with actual requirements and that the tests are sound. This is an iterative process that goes under the guise of Behaviour Driven Development: the desired behaviour drives the development and test! Another term you may see used for the same process is “Specification by Example” (2). The irony is not lost on us that Cucumber and Gherkin are names which in no way describe what the tool is or does – still, it is catchy!

Ok, pause for a breath…

To recap, Cucumber should be thought of as a collaboration tool that brings the Three Amigos together in order to define, using examples, a scenario. When Cucumber runs it generates helpful skeleton code that can be filled in by Test Engineers to create an automated acceptance test framework. The cumulative behaviours in all of the Feature files will eventually equate to the specifications for the system.


Now, how to link Cucumber into your Continuous Integration and Deployment framework. We have discussed Continuous Integration & Deployment with Docker, Jenkins & Selenium before; however, it can be confusing to see just how all these bits link together…

The way we do it is to have our automated tests safely tucked away in GitHub. We have Jenkins and GitHub linked together using GitHub’s inbuilt facility – web hooks. A change in the base code will trigger Jenkins to run the job. The source code uses Maven for dependency management, which in turn uses Profiles – simply a collection of tests under a test suite. Jenkins is configured to execute Maven tests so that test suites can be run accordingly. (See (3) for a diagram.)

We can’t finish without mentioning that maximum benefit is achieved if you use Cucumber in an Agile testing framework. You get all the benefits of short iterations, quickly finding defects, fewer handovers due to multi-disciplinary teams, etc. However, just collaborating in the Three Amigos style can assist you no end in understanding what you are supposed to be testing.

Final Summary – Cucumber can be thought of as a collaboration tool that, in conjunction with a Specification by Example process, can bring enormous benefits to your test automation efforts. Cucumber won’t write your automation tests for you, though it creates skeleton code from a plain-English Feature file. If the Three Amigos (a Business Analyst, a Test Analyst and a Developer) work together in short bursts, a common understanding and language is achieved, greatly increasing your chances of delivering successful software. We actively encourage the adoption of this approach to enable you to achieve your goals.
