A Big and Complex Interview

For this interview, we talked to Greg Wester, Senior Member of Technical Staff; Craig Jennings, Senior Director of Quality Engineering; and Ritu Ganguly, QE Director at Salesforce.

Salesforce.com is a cloud-based enterprise software company specializing in software as a service (SaaS). Best known for its Customer Relationship Management (CRM) product, it was ranked number 27 in Fortune’s 100 Best Companies to Work For in 2012.

What is big or complex about your system (users, physical size, data, load, distribution, safety, regulation, security, other)?

This should give you an idea. Salesforce processes 700 million highly complex business transactions per day for nearly 3 million active users, whose raw processing needs are growing at a 50% compounded annual rate. We expect to soon exceed 1 billion transactions a day across 6 global data centers housing 20 computing clusters, which we call “pods”. We have a multitenant architecture in which each customer’s data lives alongside that of a group of other customers in one of these pods. Within a pod we have a horizontally scaled application tier on x86 commodity hardware that, among other things, also hosts a distributed cache. We have a homegrown message queueing system that allows asynchronous processing to be scheduled in either the database tier or the application tier.
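
To make that last idea concrete, here is a minimal toy sketch of tier-targeted asynchronous scheduling. The names (TieredQueueSketch, Tier, schedule) are invented for illustration and are not Salesforce’s actual queueing API.

    import java.util.EnumMap;
    import java.util.Map;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    /** Toy model of tier-targeted async scheduling; not Salesforce's actual API. */
    public class TieredQueueSketch {

        enum Tier { DATABASE, APPLICATION }

        // One queue per tier, so workers in each tier poll only their own work.
        private final Map<Tier, BlockingQueue<Runnable>> queues = new EnumMap<>(Tier.class);

        public TieredQueueSketch() {
            for (Tier t : Tier.values()) {
                queues.put(t, new LinkedBlockingQueue<>());
            }
        }

        /** Producers choose which tier should execute the job. */
        public void schedule(Tier tier, Runnable job) {
            queues.get(tier).add(job);
        }

        /** Run at most one pending job for a tier (a real worker would loop). */
        public void runOne(Tier tier) {
            Runnable job = queues.get(tier).poll();
            if (job != null) {
                job.run();
            }
        }

        public static void main(String[] args) {
            TieredQueueSketch q = new TieredQueueSketch();
            q.schedule(Tier.APPLICATION, () -> System.out.println("app tier: resize uploaded file"));
            q.schedule(Tier.DATABASE, () -> System.out.println("db tier: rebuild index"));
            q.runOne(Tier.APPLICATION);
            q.runOne(Tier.DATABASE);
        }
    }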

Do you remember any remarkable event that changed your mind about how big or complex your system is?

This company has a strong culture of “putting our money where our mouth is”, so it should be no surprise that we use our own product to run our business. It’s also well-known that Salesforce employees collaborate, share, and align on our corporate social networking product, Chatter. When Chatter was still under development, our founder, Marc Benioff, encouraged every employee to share their vision and goals document on their Salesforce profile on Chatter. As a result, one of our non-customer-facing pods showed a brief performance decrease while we added physical storage to the file servers. This affirmed that monitoring and management tools are often as important as the product software itself in achieving high uptime. You have to watch what’s happening, you have to respond quickly, and you have to learn from what’s happened. Our early move toward being an open social enterprise exceeded our estimates, but we were prepared by our DNA of using our own product to run our own business.

Do you document testing as you have in the past, or has documentation become leaner even with a big or complex system?

At Salesforce, we think automated test cases describe how a feature works far better and more efficiently than a design document. We’re an Agile shop, so our design documentation isn’t voluminous. The aggregate of tests that have passed and failed is a more current, accurate, and detailed description of our product than any written test document. We have made our documentation more efficient based on our customers’ needs: we reviewed traditional test plans and strategies and kept what is needed, but our philosophy is lean: less documentation and more testing. That’s not to say we don’t do test planning. Our Quality Engineers must first think about the feature at a high level, and we built a tool that encourages exactly that: you must understand the customer’s use cases, list out all your assumptions, and plan out your testing strategy. After that, though, we get the engineer right into their coding environment and keep them there. They stub out their test cases and write each test’s intent and expected result in Javadoc. When those tests are checked in, a tool parses the Javadoc and automatically puts the test case name, description, and expected result into our test case repository. It’s quite intelligent and keeps the engineer productive, since they don’t have to switch contexts at all.
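
For illustration, a stubbed test at that stage might look like the sketch below. The class, test name, and wording are invented, not Salesforce’s actual conventions, but they show the kind of Javadoc such a check-in tool could parse.

    import org.junit.Test;

    /** Stubbed functional tests for sharing a file on a Chatter feed. */
    public class ChatterFileSharingTest {

        /**
         * Intent: verify that a file attached to a Chatter post is visible
         * to every member of the poster's group.
         *
         * Expected result: each group member sees the file in the feed and
         * in the group's Files list.
         */
        @Test
        public void fileSharedWithGroupIsVisibleToMembers() {
            // Stub: the check-in tool parses the Javadoc above and records
            // the name, description, and expected result in the test case
            // repository; the body is filled in as the feature lands.
        }
    }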

What type of SDLC do you follow? Have you found limitations in your SDLC due to the size of the system you support?

Salesforce has its own flavor of Agile called the Adaptive Delivery Methodology, or ADM. It’s pretty much textbook Scrum with a few twists. Product Owners prioritize features from a backlog based on customer interest and business opportunity. Teams of four to a dozen engineers in development, quality, performance, user experience, and documentation meet in a daily stand-up meeting and collaborate to deliver “potentially releasable” features each iteration, whose length each team chooses for itself: most are on two-week iterations, but some run week-long sprints. Our customer base requires that we introduce features with ample notice and staging beforehand on sandbox environments. We are very careful about what goes into a patch release, because a fix for one customer can turn out to be a bug for another.

What is the biggest problem you face in delivering your system to users?

Our platform is basically an ecosystem that is built and managed by us but controlled by our customers. Our sales are increasing at over 38% year over year. As a result of this success, the performance-tuning tweaks we’ve verified and deployed to our system today may be suboptimal a year from now, even if we make no major code changes. Scale is key. We know we have some of the best Technical Operations and R&D teams in the world, and their coordinated success is ultimately the foundation of our business model.

How has your testing strategy changed as your system got bigger or more complex?

Definitely. We have had to look at our customers’ complex implementation needs, business processes, and customizations, and ensure that we represent real customer scenarios in our testing. As we grew at an enormous rate, we learned how support cases that escalate to R&D drag down the velocity of feature work: they mire teams in bug fixing, and too many bug fixes in patch releases also introduce risk to the product. The goal of our testing strategy is to minimize the amount of time spent supporting a feature after it’s released. In other words, our aim must be to find all of the impactful bugs, corner cases, and quirks up front. We leverage tests written by customers in our Apex Code language to verify their use cases before each major release. Since some bugs are inevitable even with a very thorough process, we also put effective monitoring and management systems in place so that we can react to issues immediately when they arise.

Did you change the experience or job requirements for test engineers as a result of a bigger or more complex system?

Salesforce’s customers expect our system to have minimal downtime each year and no unscheduled downtime. We are delivering a service at a scale where every test must be automated, and most features have more source lines of test code than application code. We hire into Quality Engineering only software engineers who can perform both white-box and black-box testing. Not every development engineer has the instinct to be a test engineer, and vice versa. Quality Engineering requires solid programming skills, a laser focus on customer service, a knack for risk management, and an eye for hidden or low-frequency/high-impact bugs. We have also expanded our performance testing team.

Have you changed your reliance on test automation due to size or complexity?

We rely on it in ever-increasing amounts; it is the only way we can continue scaling. We think test automation has a maturity model by which you can measure the commitment to quality within an organization:

  • Level 1 is unit test coverage, representing a certification by the developer that individual implementations of software classes function in a particular way (a minimal example appears after this list).
  • Level 2 includes functional testing of a module of software classes.
  • Level 3 is end-to-end testing of every module in a particular application while it is running on a single host or node.
  • Level 4 is testing the application under load for an extended period with all of its supporting subsystems, including the database, cache, message queues, etc.
  • Level 5 has the same parameters as Level 4, with the added requirement that every piece of hardware, every operating system library, and every configuration is as it appears in a production environment with customer data.

When we start seeing diminishing returns from one level, we move to the next.
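
For concreteness, a Level 1 test might look like the following minimal JUnit example; the DiscountCalculator class is invented for illustration and is not Salesforce code.

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    /**
     * Level 1 in the maturity model above: developer-written unit tests
     * certifying that one class functions in a particular way.
     */
    public class DiscountCalculatorTest {

        /** Example class under test: 10% off orders of $1,000 or more. */
        static class DiscountCalculator {
            double apply(double orderTotal) {
                return orderTotal >= 1000.0 ? orderTotal * 0.9 : orderTotal;
            }
        }

        @Test
        public void largeOrdersGetTenPercentOff() {
            assertEquals(900.0, new DiscountCalculator().apply(1000.0), 0.001);
        }

        @Test
        public void smallOrdersPayFullPrice() {
            assertEquals(999.0, new DiscountCalculator().apply(999.0), 0.001);
        }
    }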

What percentage of your tests are automated?

Over 90%. Our goal is that no team runs manual tests, unless you count exploratory testing (which every team does). That last 10% is amazingly difficult. We’re forced to test manually where the tools for automation are still in their infancy, such as on mobile platforms. We’ve made impressive leaps forward in these areas, but there’s still work to do. Talk to us next year; we’ll have solved some of those problems and be closer to 100%.

What do you see in the future for testing big or complex systems?

Mainstream tools like JUnit were designed for unit testing a single class but have evolved to accommodate complex functional testing scenarios. Unit testing a simple piece of code has a binary outcome: pass or fail. Functional testing of a live distributed system, whose components are designed around loose service-level agreements to accommodate graceful degradation and the failure of neighbors, is a different matter. The tools for this are in their infancy, and they require engineers as creative and talented as the ones who designed the system to write frameworks for testing it. This presents an opportunity for a thought leader to emerge with an industry standard for distributed software testing. We’re proud of our accomplishments in this area and are aiming for that goal.
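
As a sketch of what such a test might assert, the example below treats the pass criterion as a loose SLA rather than a binary outcome. ProductService and its cache-outage fallback are hypothetical stand-ins, not a real framework.

    import java.time.Duration;
    import java.time.Instant;

    /**
     * Hypothetical functional test for graceful degradation; the service
     * and its fallback behavior are illustrative stand-ins.
     */
    public class GracefulDegradationTest {

        interface ProductService { String lookup(String id); }

        /** Simulates a service whose cache is down, falling back to a slower source of truth. */
        static ProductService serviceWithCacheOutage() {
            return id -> {
                try {
                    Thread.sleep(80); // pretend database-tier latency on cache miss
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "record-" + id;
            };
        }

        public static void main(String[] args) {
            ProductService svc = serviceWithCacheOutage();
            Instant start = Instant.now();
            String result = svc.lookup("42");
            long elapsedMs = Duration.between(start, Instant.now()).toMillis();

            // Unlike a binary unit-test assertion, the pass criterion is a
            // loose SLA: the right answer within a degraded-mode time budget.
            if (!"record-42".equals(result)) {
                throw new AssertionError("wrong result under degradation: " + result);
            }
            if (elapsedMs > 500) {
                throw new AssertionError("degraded-mode SLA exceeded: " + elapsedMs + "ms");
            }
            System.out.println("degraded-mode lookup ok in " + elapsedMs + "ms");
        }
    }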

