A Big and Complex Interview

For this interview, we talked to Greg Wester, Senior Member of Technical Staff; Craig Jennings, Senior Director of Quality Engineering; and Ritu Ganguly, QE Director at Salesforce.

Salesforce.com is a cloud-based enterprise software company specializing in software as a service (SaaS). Best known for its Customer Relationship Management (CRM) product, it was ranked number 27 in Fortune’s 100 Best Companies to Work For in 2012.

What is big or complex about your system (users, physical size, data, load, distribution, safety, regulation, security, other)?

This should give you an idea. Salesforce processes 700 million highly complex business transactions per day for nearly 3 million active users, whose raw processing needs are growing at a 50% compounded annual rate. We expect to soon exceed 1 billion transactions a day across 6 global data centers housing 20 computing clusters, which we call “pods”. We have a multitenant architecture in which each customer’s data lives alongside that of a group of other customers in one of these pods. Within a pod we have a horizontally scaled application tier on x86 commodity hardware that, among other things, hosts a distributed cache. We have a home-grown message queueing system that allows asynchronous processing to be scheduled in either the database tier or the application tier.
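
To make the shape of that concrete, here is a minimal Java sketch of an application-tier enqueue path. The queueing system itself is proprietary and home-grown, so every name and interface below is hypothetical:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    /** A unit of work deferred off the synchronous request path. */
    interface AsyncJob {
        String tenantId();   // multitenant: every job is scoped to one tenant
        void run();
    }

    /** Minimal in-memory stand-in for a distributed message queue. */
    class AsyncQueue {
        private final BlockingQueue<AsyncJob> pending = new LinkedBlockingQueue<>();

        /** Called from the application tier to schedule work for later. */
        void enqueue(AsyncJob job) {
            pending.add(job);
        }

        /** A worker in the application or database tier drains the queue. */
        void workerLoop() throws InterruptedException {
            while (true) {
                pending.take().run();
            }
        }
    }

In the real system the queue is distributed and persistent rather than an in-process BlockingQueue; the sketch only shows the division of labor between the tier that enqueues work and the tier that executes it.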

Do you remember any remarkable event that changed your mind about how big or complex your system is?

This company has a strong culture of “putting our money where our mouth is”, so it should be no surprise that we use our own product to run our business. It’s also well-known that Salesforce employees collaborate, share, and align on our corporate social networking product, Chatter. When Chatter was still under development, our founder, Marc Benioff, encouraged every employee to share their vision and goals document on their Salesforce profile on Chatter. The resulting surge in uploaded files caused a brief performance dip on one of our non-customer-facing pods while we added physical storage to the file servers. This affirmed that monitoring and management tools are often as important as the product software itself in achieving high uptime. You have to watch what’s happening, you have to respond quickly, and you have to learn from what’s happened. Our early movement towards being an open social enterprise exceeded our estimates. However, we were prepared by our DNA of using our own product to run our own business.

Do you document testing as you have in the past, or has documentation become leaner even with a big or complex system?

At Salesforce, we think automated test cases describe how a feature works far better and more efficiently than a design document. We’re an Agile shop, so our design documentation isn’t voluminous. The aggregate of tests that have passed and failed is a more current, accurate, and detailed description of our product than any written test documents. We have made our documentation more efficient based on our customers’ needs: we reviewed traditional test plans and strategies and kept what is needed, but our philosophy is lean: less documentation and more testing. That’s not to say we don’t do test planning. Our Quality Engineers must first think about the feature at a high level, and we built a tool that encourages exactly that. You must understand the customer’s use cases, list out all your assumptions, and plan out your testing strategy. After this, though, we get the engineer right into their coding environment and keep them there. They’ll stub out their test cases and write the intent of each test case and its expected result in Javadoc. When those tests are checked in, a tool parses the Javadoc and puts the test case name, description, and expected result into our test case repository automatically. It’s quite intelligent and keeps the engineer productive, since they don’t have to switch contexts at all.
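
For illustration, a stubbed test of the kind described might look broadly like the following; the interview doesn’t specify the exact Javadoc conventions the parsing tool expects, so the format here is an assumption:

    import org.junit.Test;

    public class AccountSharingTest {

        /**
         * Verify that a record shared to a group is visible to every member
         * of that group.
         *
         * Expected result: each group member can read the record; users
         * outside the group cannot.
         */
        @Test
        public void sharedRecordIsVisibleToGroupMembers() {
            // Stubbed at planning time; the engineer fills in the body without
            // leaving the coding environment. On check-in, a tool parses the
            // Javadoc above and files the test name, description, and expected
            // result into the test case repository automatically.
        }
    }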

What type of SDLC do you follow? Have you found limitations in SDLC due to the size of the system you support?

Salesforce has its own flavor of Agile called the Adaptive Delivery Methodology, or ADM. It’s pretty much textbook Scrum with a few twists. Product Owners prioritize features from a backlog based on customer interest and business opportunity. Teams of four to a dozen engineers in development, quality, performance, user experience, and documentation meet in a daily stand-up and collaborate to deliver “potentially releasable” features each iteration, whose length each team chooses for itself. Most are on two-week iterations, but some are on week-long sprints. Our customer base requires that we introduce features with ample notice and staging beforehand on sandbox environments. We are very careful about what goes into a patch release, because a fix for one customer can turn out to be a bug for another.

What is the biggest problem you face in delivering your system to users?

Our platform is basically an ecosystem that is built and managed by us but controlled by our customers. Our sales are increasing at over 38% year over year. As a result of this success, the performance-tuning tweaks we’ve verified and deployed to our system today may be suboptimal a year from now, even if we make no major code changes. Scale is key. We know we have some of the best Technical Operations and R&D teams in the world. Their coordinated success is ultimately the foundation of our business model.

How has your testing strategy changed as your system got bigger or more complex?

Definitely. We have had to study our customers’ complex implementation needs, business processes, and customizations, and ensure that our testing represents real customer scenarios. As we grew at an enormous rate, we learned how support cases that escalate to R&D drag down the velocity of feature work: they mire teams in bug fixing. Too many bug fixes in patch releases also introduce risk to the product. The goal of our testing strategy is to minimize the amount of time spent supporting a feature after it’s released. In other words, our aim must be to find all of the impactful bugs, corner cases, and quirks. We leverage tests written by customers in our Apex Code language to verify their use cases before each major release. Since some bugs are inevitable even with a very thorough process, we then put effective monitoring and management systems in place so that we can react to issues immediately when they arise.
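
Apex syntax closely resembles Java, so a customer-written use-case test might look broadly like this sketch (shown here in Java with JUnit; the customization and all names are hypothetical):

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CustomerDiscountRuleTest {

        /** Stand-in for a pricing customization the customer built on the platform. */
        static class PricingRule {
            double discountRate(int loyaltyYears) {
                return loyaltyYears >= 5 ? 0.10 : 0.0;
            }
        }

        @Test
        public void loyalCustomersReceiveTenPercentDiscount() {
            // If a platform release changed this behavior, the customer's test
            // would fail before the release ships rather than in production.
            assertEquals(0.10, new PricingRule().discountRate(5), 1e-9);
        }
    }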

Did you change the experience or job requirements for test engineers as a result of a bigger or more complex system?

Salesforce’s customers expect our system to have minimal downtime each year and no unscheduled downtime. We are delivering a service at a scale where every test must be automated, and most features have more source lines of test code than application code. We hire only software engineers into Quality Engineering, and they must be able to perform both white-box and black-box testing. Not every development engineer has the instinct to be a test engineer, and vice versa. Quality Engineering requires solid programming skills, a laser focus on customer service, a knack for risk management, and an eye for hidden or low-frequency/high-impact bugs. We have also expanded our performance testing team.

Have you changed your reliance on test automation due to size or complexity?

We rely on it increasingly; it is the only way we can continue scaling. We think test automation has a maturity model by which you can measure the commitment to quality within an organization:

  • Level 1 is unit test coverage, representing a certification by the developer that individual implementations of software classes function in a particular way.
  • Level 2 includes functional testing of a module of software classes.
  • Level 3 is end-to-end testing of every module in a particular application while it is running on a single host or node.
  • Level 4 is testing the application under load for an extended period with all of its supporting subsystems including database, cache, message queues, etc.
  • Level 5 has the same parameters as Level 4, with the added requirement that every piece of hardware, every operating system library, and every configuration is as it appears in a production environment with customer data.

When we start seeing diminishing returns from one level, we move to the next.
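
To make Level 1 concrete, here is a minimal JUnit sketch of the per-class certification it describes; the class under test is hypothetical:

    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CurrencyConverterTest {

        /** Stand-in class under test. */
        static class CurrencyConverter {
            private final double rate;
            CurrencyConverter(double rate) { this.rate = rate; }
            double convert(double amount) { return amount * rate; }
        }

        /** Level 1: the developer certifies that one class behaves as designed. */
        @Test
        public void convertsUsingTheSuppliedRate() {
            assertEquals(125.0, new CurrencyConverter(1.25).convert(100.0), 1e-9);
        }
    }

Each subsequent level exercises the same logic through progressively wider lenses: a module, an end-to-end application, and finally the full production-like stack under load.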

What percentage of your tests are automated?

Over 90%. Our goal is that no team runs manual tests, unless you count exploratory testing (which every team does). That last 10% is amazingly difficult. We’re forced to test manually where the tools for automation are in their infancy, such as on mobile platforms. We’ve made impressive leaps forward in these areas, but there’s still work to do. Talk to us next year. We’ll have solved some of those problems and be closer to 100%.

What do you see in the future for testing big or complex systems?

Mainstream tools like JUnit were designed for unit testing a single class but have evolved to accommodate complex functional testing scenarios. Unit testing on a simple piece of code has a binary outcome: pass or fail. Functional testing on a live distributed system, whose components are designed around loose service-level agreements to accommodate graceful degradation and failure of their neighbors, is different. The tools for this are in their infancy, and they require engineers as creative and talented as the ones who designed the system to write frameworks for testing it. This presents an opportunity for a thought leader to emerge with an industry standard for distributed software testing. We’re proud of our accomplishments in this area, and we are aiming for that goal.
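
To illustrate the difference, a functional check against a gracefully degrading system cannot assert once and fail immediately; it has to build retries and deadlines into the assertion itself. A minimal sketch of that pattern, with names of our own invention:

    import java.time.Duration;
    import java.time.Instant;
    import java.util.function.Supplier;

    public final class EventuallyAssert {

        /**
         * Poll a condition until it holds or the deadline passes. Unlike a
         * unit-test assertion, a check against a distributed system must
         * tolerate a neighbor that is momentarily degraded but still within
         * its service-level agreement.
         */
        public static void eventually(Supplier<Boolean> condition,
                                      Duration timeout,
                                      Duration pollInterval) throws InterruptedException {
            Instant deadline = Instant.now().plus(timeout);
            while (Instant.now().isBefore(deadline)) {
                if (condition.get()) {
                    return;              // condition held within the allowed window
                }
                Thread.sleep(pollInterval.toMillis());
            }
            throw new AssertionError("condition not met within " + timeout);
        }
    }

A test can then state its tolerance explicitly, for example eventually(() -> cache.contains(key), Duration.ofSeconds(30), Duration.ofMillis(500)).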

LogiGear Corporation
LogiGear Corporation provides global solutions for software testing, and offers public and corporate software testing training programs worldwide through LogiGear University. LogiGear is a leader in the integration of test automation, offshore resources, and US project management for fast, cost-effective results. Since 1994, LogiGear has worked with companies ranging from Fortune 500 companies to early-stage start-ups, creating unique solutions to meet their clients’ needs. With facilities in the US and Viet Nam, LogiGear helps companies double their test coverage and improve software quality while reducing testing time and cutting costs.
