The customer was interested in developing an innovative, secure solution for sharing large files over FTP between businesses anywhere in the world, one that would solve the related problems at both local and global levels.
Client Objectives
The client works with enterprises, which demands services of the highest quality only. After the company decided to order Python development from Quintagroup, both sides began close collaboration on the project, following an agile methodology to design and program:
- a reliable solution for synchronizing large data sets between multiple remote data centers. The solution was required to keep the data in four data centers consistent and accurate over time;
- effective synchronization, performed as fast as possible over dedicated bandwidth.
Quintagroup Solution
Challenge
The project was a real challenge in terms of the range of skills, expert knowledge, and technologies required. We needed to identify frameworks and integrate a variety of technologies and products. Additionally, the customer’s requirements were not frozen, which called for effective requirements change management. The project had to be managed in real time, concurrently with the development of third-party products.
Tight coordination and a well-rounded team of web developers were key to delivering a product with a consistent architecture and a robust core. The customer had previously worked successfully with Quintagroup on other projects, utilizing the specialized skills of the team. The client had been satisfied with Quintagroup’s professionalism and expertise, so it was decided to engage us on a long-term basis to develop the data synchronization solution.
Planning
Most existing data transfer protocols have been designed for local-area network (LAN) applications in which buffer sizes far exceed the bandwidth-delay product (BDP). The BDP determines the amount of data that can be in transit in the network at any moment and is the product of the available bandwidth and the latency, measured as RTT. RTT (round-trip time) is the amount of time it takes for a packet of data to travel from the sender to its destination and back to the sender.
Since the client also operates at a global level, we needed to examine the BDP for the widely used data transfer protocol TCP. A high calculated bandwidth-delay product is an indicator of a long fat network (LFN) and of TCP's inability to synchronize large data sets effectively between remote data centers.
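The BDP calculation described above can be sketched in a few lines of Python. The link figures used here (a 1 Gbit/s path with a 100 ms RTT) are hypothetical and are not measurements from the project; the 65,535-byte figure is the largest TCP receive window available without the window scaling option. Under these assumptions about 12.5 MB must be in flight to fill the link, while a single unscaled TCP connection tops out at roughly 5.2 Mbit/s.

```python
def bdp_bytes(bandwidth_bps: int, rtt_ms: int) -> float:
    """Bandwidth-delay product: bytes that can be in transit at once."""
    return bandwidth_bps * rtt_ms / 1000 / 8


def window_limited_throughput_bps(window_bytes: int, rtt_ms: int) -> float:
    """Upper bound on throughput when at most one window is sent per RTT."""
    return window_bytes * 8 * 1000 / rtt_ms


if __name__ == "__main__":
    # Hypothetical intercontinental link, not a figure from the project.
    bandwidth_bps = 1_000_000_000  # 1 Gbit/s
    rtt_ms = 100                   # 100 ms round trip
    unscaled_tcp_window = 65_535   # bytes, TCP window without window scaling

    bdp = bdp_bytes(bandwidth_bps, rtt_ms)
    limit = window_limited_throughput_bps(unscaled_tcp_window, rtt_ms)
    print(f"BDP: {bdp / 1e6:.1f} MB in flight")
    print(f"Window-limited TCP throughput: {limit / 1e6:.2f} Mbit/s")
    print(f"Share of link used: {limit / bandwidth_bps:.2%}")
```

The gap between the two numbers is what makes such a path a long fat network: the link can carry far more data per round trip than an untuned sender is allowed to have outstanding.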
To eliminate the issues that frequently occur when using TCP, Quintagroup decided, after careful planning and research, to develop a solution based on UDT (UDP-based Data Transfer Protocol). UDT is a highly configurable, easy-to-use protocol that allows data to be transferred at a much higher speed than TCP on such networks.
Development
Our development team members worked together on the common prototype and then developed the solution. Quintagroup’s Python developers were responsible for designing and developing a series of Python packages that included accompanying tools to provide seamless data synchronization.
To establish a communication channel between the numerous individual products we used ZMQ (also known as ZeroMQ or 0MQ), a high-performance asynchronous messaging library. After developing each separate program, we needed a supervising utility, and we opted for Circus, a process and socket manager. There are several reasons why we chose Circus: it is programmable in Python, it can bind sockets and handle logging, it has a web console for monitoring, and it sends all events over a ZeroMQ PUB socket.
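To give a feel for how separate components can exchange messages over ZeroMQ, here is a minimal sketch using the pyzmq bindings. It is an illustration only, not the project's code: the function name `relay_messages`, the endpoint `inproc://sync-demo`, and the message fields are all hypothetical, and the PUSH/PULL pattern is chosen simply because it is easy to demonstrate inside one process.

```python
import zmq


def relay_messages(messages, endpoint="inproc://sync-demo"):
    """Push each message through a ZeroMQ pipe and return what arrives."""
    context = zmq.Context()
    sender = context.socket(zmq.PUSH)
    receiver = context.socket(zmq.PULL)
    # Raise an error instead of blocking forever if nothing arrives.
    receiver.setsockopt(zmq.RCVTIMEO, 1000)
    # Do not wait for undelivered messages when closing the sender.
    sender.setsockopt(zmq.LINGER, 0)
    try:
        receiver.bind(endpoint)
        sender.connect(endpoint)
        for message in messages:
            sender.send_json(message)  # serialized as JSON on the wire
        return [receiver.recv_json() for _ in messages]
    finally:
        sender.close()
        receiver.close()
        context.term()


if __name__ == "__main__":
    # Hypothetical notification about a file that needs synchronizing.
    received = relay_messages([{"file": "archive.bin", "size": 1024}])
    print(received)
```

In a multi-process deployment the two sockets would live in different programs and the endpoint would typically be a `tcp://` address rather than `inproc://`; the calls themselves stay the same.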
Oracle Solaris, a Unix operating system for the enterprise cloud, was chosen for its reliability, security, and scalability when transferring large data volumes over virtually any protocol and transport.
Verification and validation
Imitating a scenario in which four remote data centers have to synchronize data is quite a task. For that we wrote automated test cases: each one recreates a required situation and, at the same time, supports regression testing and iterative development. Automated acceptance tests are essentially detailed examples of how the web solution is supposed to function once the requirement they describe is implemented. Tools applied: Robot Framework for automated acceptance testing and Fabric for running commands on a remote server.
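As an illustration of the kind of check an automated test could make after a synchronization run, the sketch below compares content digests from four replicas and reports any that diverge from the majority. This is a hypothetical example, not the project's actual test code: the replica names, the use of SHA-256, and the function `find_divergent_replicas` are assumptions made for the sake of the example.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(content: bytes) -> str:
    """SHA-256 hex digest of one replica's content."""
    return hashlib.sha256(content).hexdigest()


def find_divergent_replicas(replicas: Dict[str, bytes]) -> List[str]:
    """Return names of replicas whose digest differs from the majority."""
    digests = {name: digest(content) for name, content in replicas.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


if __name__ == "__main__":
    # Hypothetical snapshot of the same file in four data centers.
    replicas = {
        "dc1": b"payload v2",
        "dc2": b"payload v2",
        "dc3": b"payload v1",  # stale copy
        "dc4": b"payload v2",
    }
    print(find_divergent_replicas(replicas))
```

A test would assert that the returned list is empty once synchronization has finished.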
Successful outcome
The developed web solution satisfies all the client’s requirements and more:
- the system is reliable and secure enough to transfer large data volumes;
- it can operate over Wide Area Networks (WAN) and additionally was verified to perform correctly on mobile networks;
- it is highly scalable and fast, providing synchronization without any data loss;
- it allows real-time synchronization of data across numerous geographically dispersed data centers.
Interested in learning more?
Quintagroup is a seasoned provider of web solutions and can give expert advice to assist your business or organization online. Contact us today to learn more.