Managing your computing ecosystem Pt. 3


Overview

The prospect of universal and interoperable management interfaces is closer to reality than ever. Not only is infrastructure converging, but so is the control and management plane. Last time, we discussed Redfish for managing hardware platforms. This time we will talk about Swordfish for managing storage.

Swordfish

The goal of Swordfish is to provide scalable storage management interfaces. The interfaces are designed to provide efficient, low-footprint management for simple direct-attached storage, with the ability to scale up to easy-to-use management across cooperating enterprise-class storage servers in a storage network.

The Swordfish Scalable Storage Management API specification defines extensions to the Redfish API; thus, a Swordfish service is also a Redfish service. These extensions enable simple, scalable, and interoperable management of storage resources, ranging from direct-attached storage to complex enterprise-class storage servers. The extensions are collectively named Swordfish and are defined by the Storage Networking Industry Association (SNIA) as open industry standards.

Swordfish extends Redfish in two principal areas. The first is the introduction of management and configuration based on service levels. The other is the addition of management interfaces for higher-level storage resources. The following sections provide more detail on each.

Service based management

Swordfish interfaces allow clients to get what they want without having to know how the implementation produces the results. For example, a client might want storage protected so that no more than 5 seconds of data is lost in the event of a failure. Instead of specifying implementation details like mirroring, clones, snapshots, or journaling, the interface allows the client to request storage with a recovery point objective of 5 seconds. The implementation then chooses how to meet that requirement.

The basic ideas are borrowed from ITIL (a set of practices for IT service management that focuses on aligning IT services with the needs of business) and are consistent with ISO/IEC 20000.

A Swordfish line of service describes a category of requirements. Each instance of a line of service describes a service requirement within that category. The management service will typically be configured with a small number of supported choices for each line of service. The service may allow an administrator to create new choices if it is able to implement and enforce that choice. To take an example from airlines, you have seating as one line of service with choices of first, business, and steerage. Another line of service could be meals, with choices like regular, vegetarian, and gluten free. Lines of service are meant to be independent from each other. So, in our airline example, we can mix any meal choice with any seating choice.

Swordfish provides three lines of service covering requirements for data storage (protection, security, and storage) and two covering requirements for access to data storage (connectivity and performance). Swordfish leaves the specification of specific choices within each of these lines of service to management service implementations.

A Swordfish class of service resource describes a service level agreement (SLA). If an SLA is specified for a resource, the service implementation is responsible for assuring that level of service is provided. For that reason, the management service will typically advertise only a small number of SLAs. The service may allow an administrator to create new SLAs if it is able to implement and enforce that agreement.   The requirements of an SLA represented by a class of service resource are defined by a small set of line of service choices.
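
To make this concrete, here is a minimal sketch of what requesting that kind of SLA might look like over the REST interface, using Python's requests library. The host, credentials, collection path, and property names are illustrative approximations of the Swordfish model, not an authoritative payload.

    import requests

    BASE = "https://storage.example.com/redfish/v1"   # hypothetical Swordfish endpoint
    AUTH = ("admin", "password")                       # illustrative credentials

    # A class of service bundling line-of-service choices; here, a data protection
    # requirement of "lose no more than 5 seconds of data" (ISO 8601 duration PT5S).
    class_of_service = {
        "Name": "Gold",
        "Description": "5 second recovery point objective",
        "DataProtectionLinesOfService": [
            {"RecoveryPointObjectiveTime": "PT5S"}
        ],
    }

    resp = requests.post(
        f"{BASE}/StorageServices/1/ClassesOfService",  # assumed collection path
        json=class_of_service,
        auth=AUTH,
        verify=False,   # lab sketch only: skip TLS verification
    )
    resp.raise_for_status()
    print("Created:", resp.headers.get("Location"))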

Swordfish storage

Swordfish starts with Redfish definitions and then extends them. Redfish specifies drive and memory resources from a hardware-centric point of view. Redfish also specifies volumes as block-addressable storage composed from drives; Redfish volumes may be encrypted. Swordfish then extends volumes and adds filesystems, file shares, storage pools, storage groups, and a storage service. (Object stores are intended to be added in the future.)

A storage service provides a focus for management and discovery of the storage resources of a system.  Two principal resources of the storage service are storage pools and storage groups.

A storage pool is a container of data storage capable of providing capacity that conforms to a specified class of service. A storage pool does not support I/O to its data storage. Instead, the storage pool acts as a factory that provides storage resources (volumes, file systems, and other storage pools) with a specified class of service. The capacity of a storage pool may come from multiple sources, which are not all required to be of the same type. The storage pool tracks allocated capacity and may provide alerts when space is low.
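
As a rough illustration of the factory idea, the sketch below asks a pool for a new volume that must satisfy a named class of service. The paths and property names (AllocatedVolumes, ClassOfService) are assumptions for illustration rather than a definitive Swordfish payload.

    import requests

    BASE = "https://storage.example.com/redfish/v1"   # hypothetical Swordfish endpoint
    AUTH = ("admin", "password")

    # Ask the pool (acting as a factory) for a 100 GiB volume that must meet the
    # "Gold" class of service; the implementation decides how to satisfy it.
    new_volume = {
        "Name": "app01-data",
        "CapacityBytes": 100 * 1024**3,
        "ClassOfService": {
            "@odata.id": f"{BASE}/StorageServices/1/ClassesOfService/Gold"
        },
    }

    resp = requests.post(
        f"{BASE}/StorageServices/1/StoragePools/Pool1/AllocatedVolumes",
        json=new_volume,
        auth=AUTH,
        verify=False,
    )
    resp.raise_for_status()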

A storage group is an administrative collection of storage resources (volumes or file shares) that are managed as a group. Typically, the storage group would be associated with one or more client applications. The storage group can be used to specify that all of its resources share the characteristics of a specified class of service. For example, a class of service specifying data protection requirements might be applied to all of the resources in the storage group.

One primary purpose of a storage group is to support exposing or hiding all of the volumes associated with a particular application. When exposed, all clients can access the storage in the group via the specified server and client endpoints. The storage group also supports storage (or crash) consistency across the resources in the storage group.

Swordfish extends volumes and adds support for file systems and file shares, including support for both local and remote replication. Each type supports provisioning and monitoring by class of service. The resulting SLA based interface is a significant improvement for clients over the current practice where the client must know the individual configuration requirements of each product in the client’s ecosystem. Each storage service lists the filesystems, endpoints, storage pools, storage groups, drives and volumes that are managed by the storage service.

Recommendations

These three specifications should form the basis for any RESTful system management solution.

As a starting point, OData provides a uniform interface suitable for any data service. It is agnostic to the functions of the service, but it supports inspection of an entity data model via an OData-conformant metadata document provided by the service. Because of the generic functionality of the RESTful style, and with the help of inspection of the metadata document, any OData client can have both syntactic and semantic access to most of the functionality of an OData service implementation. OData is recommended as the basis for any RESTful service.

Redfish defines an OData data service that provides a number of basic utility functions as well as hardware discovery and basic system management functions. A Redfish implementation can be very lightweight. All computing systems should implement a Redfish management service. This recommendation runs the gamut from very simple devices in the IoT space up to enterprise-class systems.

Finally, Swordfish extends the Redfish service to provide service-based storage management. A Swordfish management service is recommended for all systems that provide advanced storage services, whether host-based or network-based.

Universal, interoperable management based on well-defined, supported standards. It may still seem like an impossible hope to some. Every day, however, we move closer to a more standard, more manageable infrastructure environment.

~George Ericson @GEricson

Managing Your Computing Ecosystem Pt. 2


Overview

We are making strides toward universal and interoperable management interfaces. These are not only interfaces that will interoperate across one vendor or one part of the stack, but management interfaces that can truly integrate your infrastructure management. Last time, we discussed OData, the REST standardization. This time we will talk about Redfish for managing hardware platforms.

Redfish

Redfish defines a simple and secure, OData conformant data service for managing scalable hardware platforms. Redfish is defined by a set of open industry standard specifications that are developed by the Distributed Management Task Force, Inc. (DMTF).

The initial development was from the point of view of a Baseboard Management Controller (BMC) or equivalent. Redfish management currently covers bare-metal discovery, configuration, monitoring, and management of all common hardware components. It is capable of managing and updating installed software, including the operating system and device drivers.

Redfish is not limited to low-level hardware/firmware management. It is also expected to be deployed to manage higher-level functionality, including configuration and management of containers and virtual systems. In collaboration with the IETF, Redfish is also being extended to include management of networks.

The Redfish Scalable Platforms Management API Specification specifies functionality that can be divided into three areas: OData extensions, utility interfaces, and platform management interfaces. These are described briefly in the following sections.

Redfish OData extensions

Redfish requires at least OData v4 and specifies some additional constraints:

  • Use of HTTP v1.1 is required, with support for POST, GET, PATCH, and DELETE operations, including requirements on many HTTP headers
  • JSON representations are required within payloads
  • Several well-known URIs are specified
    • /redfish/v1/ returns the ServiceRoot resource for locating resources
    • /redfish/v1/OData/ returns the OData service document for locating resources
    • /redfish/v1/$metadata returns the OData metadata document for locating the entity data model declarations (see the sketch after this list)
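
Here is a minimal sketch of how a client might touch those well-known URIs with Python's requests library. The host name is hypothetical and error handling is omitted.

    import requests

    BASE = "https://bmc.example.com"   # hypothetical Redfish endpoint

    # The service root is always at /redfish/v1/ and links to the other collections.
    root = requests.get(f"{BASE}/redfish/v1/", verify=False).json()
    print(root.get("RedfishVersion"))
    print(root.get("Systems", {}).get("@odata.id"))    # e.g. /redfish/v1/Systems

    # The OData metadata document describes the entity data model (CSDL XML).
    metadata = requests.get(f"{BASE}/redfish/v1/$metadata", verify=False)
    print(metadata.headers.get("Content-Type"))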

Redfish also extends the OData metamodel with an additional vocabulary for annotating model declarations. The annotations specify information about, or behaviors of, the modeled resources.

Redfish utility interfaces

The utility interfaces provide functionality that is useful for any management domain (for example, these interfaces are used by Swordfish for storage management). These interfaces include account, event, log, session, and task management.

The account service manages access to a Redfish service via manager accounts and roles.

The event service provides the means to specify events and to subscribe to indications when a defined event occurs on a specified set of resources. Each subscription specifies where indications are sent; this can be to a listening service or to an internal resource (e.g., a log service).
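
As a sketch of what a subscription might look like, the snippet below registers an external listener for alert events. The destination URL, host, and credentials are assumptions; the subscription properties follow the Redfish EventService model.

    import requests

    BASE = "https://bmc.example.com"   # hypothetical Redfish endpoint
    AUTH = ("admin", "password")

    # Subscribe a listening service to receive alert indications.
    subscription = {
        "Destination": "https://listener.example.com/events",  # where indications are sent
        "EventTypes": ["Alert"],
        "Context": "my-monitoring-app",
        "Protocol": "Redfish",
    }

    resp = requests.post(
        f"{BASE}/redfish/v1/EventService/Subscriptions",
        json=subscription,
        auth=AUTH,
        verify=False,
    )
    resp.raise_for_status()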

Each log service manages a collection of event records, including size and replacement policies. Resources may have multiple log services for different purposes.

The session service manages sessions and enables creation of an X-Auth-Token representing a session used to access the Redfish service.
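
A short sketch of the session flow, assuming a hypothetical endpoint and credentials: create a session, use the returned X-Auth-Token on later requests, and delete the session resource to log out.

    import requests

    BASE = "https://bmc.example.com"   # hypothetical Redfish endpoint

    # Create a session; the token is returned in the X-Auth-Token response header.
    resp = requests.post(
        f"{BASE}/redfish/v1/SessionService/Sessions",
        json={"UserName": "admin", "Password": "password"},
        verify=False,
    )
    resp.raise_for_status()
    token = resp.headers["X-Auth-Token"]
    session_uri = resp.headers["Location"]

    # Subsequent requests authenticate with the token instead of basic auth.
    chassis = requests.get(
        f"{BASE}/redfish/v1/Chassis",
        headers={"X-Auth-Token": token},
        verify=False,
    ).json()
    print([m["@odata.id"] for m in chassis.get("Members", [])])

    # Deleting the session resource ends the session (logout).
    logout_uri = session_uri if session_uri.startswith("http") else f"{BASE}{session_uri}"
    requests.delete(logout_uri, headers={"X-Auth-Token": token}, verify=False)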

The task service manages tasks that represent independent threads of execution known to the Redfish service. Typically, tasks are spawned as a result of a long-running operation.

The update service provides management of firmware and software resources, including the ability to update those resources.

Redfish platform management interfaces

The principal resources managed by a Redfish service are chassis, computer systems and fabrics. Each resource has its current status. Additionally, each type of resource may have references to other resources, properties defining the current state of the resource, and additional actions as necessary.

Each chassis represents a physical or logical container. It may represent a sheet-metal confined space like a rack, sled, shelf, or module. Or, it may represent a logical space like a row, pod, or computer room zone.

Each computer system represents a computing system and its software-visible resources such as memory, processors, and other devices that can be accessed from that system. The computer system can be a general-purpose system or a specialized system like a storage server or a switch.

Each fabric represents a collection of zones, switches and related endpoints. A zone is a collection of involved switches and contained endpoints. A switch provides connectivity between a set of endpoints.

All other subsystems are represented as resources that are linked via one or more of these principal resources. These subsystems include: BIOS, drives, endpoints, fans, memories, PCIe devices, ports, power, sensors, processors, and various types of networking interfaces.
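
To give a feel for how a client navigates these resources, here is a small sketch that walks the computer systems collection and drills into each system's processors. The host and credentials are hypothetical, and the properties shown are a small sample of what a real service returns.

    import requests

    BASE = "https://bmc.example.com"   # hypothetical Redfish endpoint
    AUTH = ("admin", "password")       # illustrative credentials

    def get(path):
        # Fetch a resource by its @odata.id and return the parsed JSON body.
        return requests.get(f"{BASE}{path}", auth=AUTH, verify=False).json()

    # Walk every computer system and report its state, then drill into processors.
    for member in get("/redfish/v1/Systems").get("Members", []):
        system = get(member["@odata.id"])
        print(system.get("Name"), system.get("PowerState"),
              system.get("Status", {}).get("Health"))
        procs_link = system.get("Processors", {}).get("@odata.id")
        if procs_link:
            for p in get(procs_link).get("Members", []):
                cpu = get(p["@odata.id"])
                print("  ", cpu.get("Model"), cpu.get("TotalCores"))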

Conclusion

Redfish delivers a standardized management interface for hardware resources. While it is beginning with basic functionality like discovery, configuration, and monitoring, it will deliver much more. It will extend into richer services and cover more than physical resources – e.g., virtual systems, containers, and networks. Redfish is built as an OData-conformant service, which makes it the second connected part of an integrated management API stack. Next up – Swordfish.

~George Ericson @GEricson

How I Learned to Stop Worrying and Love New Storage Media: The Promises and Pitfalls of Flash and New Non-volatile Memory


I tried to avoid learning about flash. I really did. I’ve never been one of those hardware types who constantly chase the next hardware technology. I’d rather work at the software layer, focusing on data structures and algorithms. My attitude was that improving hardware performance raises all boats, so I did not have to worry about the properties of devices hiding under the waves. Switching from hard drives to flash as the common storage media would just make everything faster, right?

Working for a large storage company broke me out of that mindset, though I still fought it for a few years. Even though I mostly worked on backup storage systems–one of the last hold-outs against flash–backup storage began to see a need for flash acceleration.   I figured we could toss a flash cache in front of hard drive arrays, and the system would be faster.  I was in for a rude awakening.   This multi-part blog post outlines what I have learned about working with flash in recent years as well as my view on the direction flash is heading.  I’ve even gotten so excited about the potential of media advances that I am pushing myself to learn about new non-volatile memory devices.

Flash Today

For those unfamiliar with the properties of flash, here is a quick primer. While a hard drive can supply 100-200 read/write operations per second (commonly referred to as input/output operations per second, or IOPS), a flash device can provide thousands to hundreds of thousands of IOPS. Performing a read or write to a hard drive can take 4-12 milliseconds, while a flash device can typically respond in 40-200 microseconds (10-300X faster). Flash handles more reads/writes per second and responds more quickly than hard drives. These are the main reasons flash has become widespread in the storage industry, as it dramatically speeds up applications that previously waited on hard drives.

If flash is so much faster, why do many storage products still use hard drives? The answer: price. Flash devices cost somewhere in the range of $0.20 to $2 per gigabyte, while hard drives are as inexpensive as $0.03 per gigabyte. For a given budget, you can buy an order of magnitude more hard drive capacity than flash capacity. For applications that demand performance, though, flash is required. On the other hand, we find that the majority of storage scenarios follow an 80/20 rule, where 80% of the storage is cold and rarely accessed, while 20% is actively accessed. For cost-conscious customers (and what customer isn’t cost conscious?), a mixture of flash and hard drives often seems like the best configuration. This leads to a fun system design problem: how do we combine flash devices and hard drives to meet customer requirements? We have to meet the IOPS, latency, capacity, and price requirements of varied customers. The initial solution is to add a small flash cache to accelerate some data accesses while using hard drives to provide a large capacity for colder data.
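
A back-of-the-envelope calculation, using the rough cost and IOPS figures quoted above and a workload invented purely for illustration, shows why the hybrid design is attractive:

    import math

    # Back-of-the-envelope hybrid sizing using the rough figures quoted above.
    # The workload numbers below are invented purely for illustration.
    hdd_iops, flash_iops = 150, 50_000          # IOPS per device
    hdd_cost_gb, flash_cost_gb = 0.03, 0.50     # dollars per gigabyte

    capacity_gb = 200_000      # total capacity the customer needs
    hot_fraction = 0.20        # the 80/20 rule: ~20% of data is actively accessed
    required_iops = 60_000

    # Put the hot 20% on flash and the cold 80% on hard drives.
    flash_gb = capacity_gb * hot_fraction
    hdd_gb = capacity_gb - flash_gb
    hybrid_cost = flash_gb * flash_cost_gb + hdd_gb * hdd_cost_gb

    all_hdd_cost = capacity_gb * hdd_cost_gb
    all_flash_cost = capacity_gb * flash_cost_gb
    hdd_only_drives = math.ceil(required_iops / hdd_iops)   # drives needed just for IOPS

    print(f"all-HDD:   ${all_hdd_cost:,.0f}, but needs {hdd_only_drives} drives to hit the IOPS target")
    print(f"all-flash: ${all_flash_cost:,.0f}")
    print(f"hybrid:    ${hybrid_cost:,.0f}, with the hot 20% served from flash")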

A customer requirement that gets less attention, unfortunately, is lifespan: a storage system should last a certain number of years, typically 4-5, without maintenance problems. While disk drives fail in a somewhat random manner each year, the lifespan of flash is more closely related to how many times it has been written. It is a hardware property of flash that storage cells have to be erased before being written, and flash can only be erased a limited number of times. Early flash devices supported 100,000 erasures, but that number is steadily decreasing to reduce the cost of the device. For a storage system to last 4-5 years, the flash erasures have to be used judiciously over that time. Most of my own architecture work around using flash has focused on the issues of maximizing the useful data available in flash, while controlling flash erasures to maintain its lifespan.
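
The erasure budget is simple arithmetic. The sketch below uses invented but plausible numbers (endurance and write amplification vary widely by device) to show how a lifespan target translates into a daily write budget:

    # Rough write-budget arithmetic for flash lifespan, with invented example numbers.
    device_capacity_tb = 4
    pe_cycles = 3_000            # program/erase cycles the cells can tolerate
    write_amplification = 2.0    # internal writes per host write (assumption)
    target_lifespan_years = 5

    total_host_writes_tb = device_capacity_tb * pe_cycles / write_amplification
    daily_budget_tb = total_host_writes_tb / (target_lifespan_years * 365)

    print(f"lifetime host writes: {total_host_writes_tb:,.0f} TB")
    print(f"daily write budget:   {daily_budget_tb:.1f} TB/day "
          f"({daily_budget_tb / device_capacity_tb:.1f} drive writes per day)")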

The team I have been a part of pursued several approaches to best utilize flash. First, we tried to optimize the data written to flash. We cached the most frequently accessed portions of the file system, such as index structures and metadata that are read frequently. For data that changes frequently, we tried to buffer it in DRAM as much as possible to prevent unnecessary writes (and erasures) to flash. Second, we removed as much redundancy as possible. This can mean deduplication (replacing identical regions with references), compression, and hand-designing data structures to be as compact as possible. Enormous engineering effort goes into changing data structures to be flash-optimized. Third, we sized our writes to flash to balance performance requirements and erasure limits. As writes get larger, they tend to become a bottleneck for both writes and reads; on the other hand, erasures decrease because the write size aligns with the internal erase-unit size (e.g., multiple megabytes). Depending on the flash internals, the best write size may be tens of kilobytes to tens of megabytes. Fourth, we created cache eviction algorithms specialized for internal flash erasure concerns. We throttled writes to flash and limited internal rearrangements of data (which also cause erasures) to extend flash lifespan.
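
To make a couple of those ideas concrete, here is a toy sketch of an erasure-aware cache admission policy: blocks are only admitted to flash after they have been read several times (so one-hit wonders stay on disk), and writes are batched into large, erase-unit-aligned containers. This is an illustration of the general technique, not the Data Domain implementation.

    from collections import Counter, deque

    class ErasureAwareCache:
        """Toy sketch of an admission policy that spends flash erasures carefully."""

        def __init__(self, admit_after=2, container_blocks=2048):
            self.read_counts = Counter()          # disk reads seen per block
            self.admit_after = admit_after        # reads required before caching
            self.container = deque()              # blocks staged for one big write
            self.container_blocks = container_blocks
            self.cached = set()

        def on_read(self, block_id):
            if block_id in self.cached:
                return "flash hit"
            self.read_counts[block_id] += 1
            if self.read_counts[block_id] >= self.admit_after:
                self.container.append(block_id)   # stage, don't write immediately
                if len(self.container) >= self.container_blocks:
                    self._flush_container()
            return "disk read"

        def _flush_container(self):
            # One large, erase-unit-sized write instead of many small ones.
            self.cached.update(self.container)
            self.container.clear()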

Working with a strong engineering team to solve these flash-related problems is a recent highlight of my career, and flash acceleration is a major component of the 6.0 release of Data Domain OS. Besides working with engineering, I have also been fortunate to work with graduate students researching flash topics, which culminated in three publications. First, we created Nitro, a deduplicated and compressed flash cache. Next, Pannier is a specially designed flash caching algorithm that handles data with varying access patterns. Finally, we wanted to compare our techniques to an offline-optimal algorithm that maximized cache reads while minimizing erasures. Such an algorithm did not exist, so we created it ourselves.

In my next blog post, I will present technology trends for flash. For those that can’t wait, the summary is “bigger, cheaper, slower.”

~Philip Shilane @philipshilane

How To Get Things Done


“How can we get anything done across products?”

That was the theme of the 2016 EMC Core Technologies Senior Architect Meeting. Every year, we gather the senior technical leaders to discuss market directions, technology trends, and our solutions. This year included evolving storage media, storage connectivity, Copy Data Management, Analytics, CI/HCI, Cloud, and more. While the technical topics generated discussion and debate, the passion was greatest around – “How can we get anything done across products?” Each Senior Architect got to their position by successfully driving an agenda in their product groups, so they find their lack of cross-product influence to be exceptionally frustrating.

While the challenge may sound unique to senior leaders in a large organization, it’s a variant of the most common question I get from techies of all levels across all companies: “How can I get things done?”

What’s the Value?

Engineers – if your idea does not either generate revenue or save cost, you’re going to have a difficult time generating interest from business leaders, sales, and customers. Everybody loves talking about exciting technology, but they pay for solutions to business problems.  Too often, engineers propose projects that customers like, but would not pay for.

An internal team once proposed a project that would make our UI look “cooler”. I asked what it would do for the customer. It wouldn’t eliminate a user task. It wouldn’t help them get more done. But they were convinced it would be more “fun” which would convince more enterprises to buy the product. Not surprisingly, we didn’t pursue that project.

I recently met a startup with very exciting technology, but I couldn’t see how/why anybody would pay for it. The founder looked me in the eye and said, “People will love it so much, that they’ll just send me checks in the mail. But I’ll only cash the big ones, since smaller companies shouldn’t have to pay.” I started laughing at his joke, then felt really guilty (OK, sort of guilty) when I realized he was serious.

As you think about your value, it’s preferable to focus on revenue generation. Customers and executives would rather invest in solutions that increase their revenue rather than those that save costs. Cost-saving discussions are either uncomfortable (and then you lay off ‘n’ people) or hard to justify (if you spend a lot of money today, you’ll save even more… in three years). On the other hand, everybody likes talking about generating new revenue.

My Executive Briefing Center sessions often come after either Pivotal or Big Data discussions. The customers are excited about CloudFoundry, analytics, and new development techniques because it allows them to more quickly respond to customers and generate new revenue streams. As I walk in, they’re excitedly inviting the Pivotal presenter to dinner. After I discuss backup or storage, they say, “Thanks, this should help us reduce our costs. We still wish it weren’t so expensive, though.” Oh, and they NEVER invite me to dinner. Because nobody likes the “cost cutting” person. Or nobody likes me. Either one.

What are the Alternatives?

Technical people tend to make three mistakes when pitching an idea.

Mistake 1: Leading the audience through your entire thought process.

First, most senior people don’t have the attention span (I blame a day full of 30 minute meetings) to wait for your conclusion. Quickly set context, then get to the conclusion. Be prepared to support your position, but let them question you; don’t pre-answer everything. Second, most people don’t problem solve the same way you do, so your “obvious” thought path may not be clear to others. Finally, the longer you talk, the less likely you are to have a conversation. Your audience wants to be involved in a decision; that only happens when they can express their viewpoint and know that you’ve understood it.

Mistake 2: Not presenting actions

Let’s say you’ve made an astounding presentation. The audience is engaged. You’ve had a great discussion. Everybody supports the conclusion. And… you walk away. Too often, engineers forget to add: “And here’s what we need to do.” If you don’t ask for something to be done, nothing will be done.

Mistake 3: Not presenting alternatives

People and executives (some of whom display human characteristics) want to feel like they have some control over things. That means they want to be able to make choices. They also want to believe that you, the presenter, have considered many alternatives before drawing your conclusion. To satisfy both needs, you must present two or three (more than that and it’s overwhelming) legitimate approaches that address the challenge. If you don’t, they’ll feel like you’re trapping them.

One of my worst presentations was titled – “Large file system restores are slow.” I spent an hour walking through 23 slides detailing the pain of restoring large file systems (both by capacity and file count). At the end, the Sr. Director said, “We knew it was slow. That’s why we hired you. Are you saying that we can’t hire someone to solve this, or that we just made the wrong hire?” Now THAT is an example of quickly presenting actionable alternatives.

Who are You Selling To?

As you sell your idea, you need to tailor the pitch to your audience.

  • What actions can you ask for? If your audience doesn’t control resources or roadmaps, then ask them for what they can give – support, personal time, etc. Conversely, if your audience can make decisions, ask for the decision. It’s better to get a “no” than to drift forever.
  • What does your audience care about? Business leaders want to hear about revenue, routes to market, investment costs, etc. Your demo may be the coolest thing ever, but it won’t move them until you get them interested. Technical leaders generally care about both, but be careful about losing them on a deep dive. Technical experts want the deep dive. Engineers want to know what work they need to do.
  • What is their background? If you’re selling an idea to non-experts, you’ll need to spend more time setting context (business, technical, etc.). If you’re talking to experts, don’t waste their time with the basics.

In other words, there is no “one size fits all” presentation. It may be more work to tailor your approach to each audience, but nobody said this was easy.

When I first started working with customers, I would race through my presentation – always doing it the same way. I was too nervous to ask what the audience was interested in hearing. As I talked, I’d never give the audience a chance to respond. I considered myself lucky if the audience sat in silence, so that I could quickly exit, drenched in sweat. One day, I walked into the Briefing Center, saw 2 people in suits sitting there, and rattled through my 30 minute talk. At the conclusion, one of them said, “That was good. That was a lot of the content we want to cover. Just so you know, the customer is running late, but they should be here soon.”

Conclusion

How do you get things done? You convince people. You need to convince business leaders, peers across groups, technical experts, and the engineers who will actually do the work. Whether you’re a new college graduate or a technical leader with decades of experience, the formula doesn’t change:

  • What’s the value?
  • What are the alternatives?
  • Who is the audience?

If you follow these guidelines, you may not always get the decision you like… but you will get a decision. And “getting decisions about actions” is the only way you can get anything done.

-Stephen Manley @makitadremel

Cloud Native for The Enterprise


In Part I of this series, we explored how the Heroku architecture wires middleware software to their platform, allowing developers to focus on writing applications to drive their business. In this part, focusing on the enterprise use case, we’ll analyze the Heroku-inspired PaaS system – Cloud Foundry.

 

Part II – Cloud Foundry – Cloud Native for the Masses

 

Cloud Foundry – The PaaS for the Enterprise

 

Cloud Foundry (CF) is considered by many to be the PaaS for the enterprise – and for good reasons. With mature APIs, built-in multi-tenancy, authentication, and monitoring, it’s no wonder vendors like IBM, HP, and GE built their platforms as certified CF distributions. Cloud Foundry can be thought of as a customizable implementation of the Heroku architecture. The main difference between Heroku and CF is the flexibility that allows CF to be installed anywhere. Like Heroku, CF adopted the strict distinction between the stateless 12-factor part of the app (“the application space”) and the stateful part of the app (“Services”). While the application space is very similar to the equivalent on Heroku, CF can’t depend on Heroku engineers to manage the services (e.g. databases/messaging). Instead, CF delegates this role to DevOps. As a result, enterprises can configure the platform to expose custom services that meet their unique needs. This sort of adaptability plays in favor of CF in the enterprise world.

 

Some of the major Cloud Foundry qualities that make it attractive in this space:

 

Avoid lock-in – With years of experience being locked in by software and infrastructure stacks, enterprises seek freedom of choice. With the CPI (Cloud Provider Interface) abstraction, Cloud Foundry (using BOSH) can be deployed on practically any infrastructure.

 

Mature on-premises use cases – The cloud is great! Enterprises are not passing up on that trend. However, reality has its own role in decision making, and many workloads are staying on premises. While security, regulations, and IT culture are often cited, what keeps a large portion of the workloads on premises is the years and years of legacy systems holding the most important asset of any organization – its data. In order to move mission critical workloads to new architectures in a remote datacenter (e.g. cloud), organizations have to port all the proprietary non-portable data too. Translating for readers with cloud native mindsets: DB2 on mainframe is not yet “Dockerized”; one can’t simply push it around in the blink of an eye.

 

Good Transition Story – CF can do more than run legacy workflows on premises. The big difference is that it provides a well-defined architectural transition story. The ability to move parts of the app or new modules to run as CF apps, while easily connecting to the legacy data systems as services (via service brokers), is powerful. This allows developers to experiment with the modern technology while accessing the core systems, giving them a real opportunity to experience the cloud native productivity boosts while keeping their feet firmly on the ground.
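
To show what that connection looks like from the application side: when a service broker binds a legacy database to a CF app, Cloud Foundry injects the credentials into the VCAP_SERVICES environment variable. The sketch below simply reads it; the service names and credential fields are assumptions for illustration.

    import json
    import os

    # Cloud Foundry injects bound service credentials into VCAP_SERVICES.
    vcap = json.loads(os.environ.get("VCAP_SERVICES", "{}"))

    for label, instances in vcap.items():
        for instance in instances:
            creds = instance.get("credentials", {})
            print(f"bound service: {instance.get('name')} ({label})")
            print(f"  uri: {creds.get('uri')}")

    # A 12-factor app would open its database connection from creds["uri"] here,
    # leaving provisioning, HA, and backup of the database to the broker/DevOps side.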

 

Compliance, Compliance and again Compliance – Many cloud native discussions leave out where data is being stored. We often hear that cloud native apps only use NoSQL or Big Data databases. While using modern databases makes sense in some use cases, in order to deliver compliant applications, organizations find it easy and safe to use mature database back-ends (e.g. Oracle/MS SQL) for serving their modern cloud native apps. With Cloud Foundry’s Service Brokers model, they are able to leverage the tools and processes they are already proficient with to protect their data, while modernizing their apps and workflows.

 

Vendor behind the platform – Although Cloud Foundry is open source, enterprises like having a throat to choke. Proprietary distributions like Pivotal Cloud Foundry support the transformation and can be engaged when customers encounter challenges.

 

While the motivation to modernize exists in almost every organization these days, not seeing a clear path for transformation can hold companies back. Many of the modern platforms (e.g. Kubernetes or Docker Datacenter) have an appealing vision for where the world of software needs to be, but it is not clear how to make a gradual transformation. By adopting CF, enterprises see a steady path that they can pursue and start getting results relatively quickly.

 

Cloud Foundry – Trouble in Paradise

 

Cloud Foundry is an exciting platform that can bring many benefits to organizations adopting it. However, there is always room to improve. CF’s flexibility in handling stateful services via the “Service Brokers” abstraction and DevOps management is what made the “transition story” possible. However, as always, when making something more generic, there are tradeoffs. Since Heroku manages both stateful and stateless services, it can retain full control over the platform. Without that control, some pain points emerge.


Cloud Native New Silos

 

On one hand, the application space is modern and fully automated, while on the other hand the services space and its integration points with the app space have quite a manual feel. It’s no surprise that the platform’s shortcomings revolve around lack of control and lack of visibility caused by the new “DevOps silo” the CF architecture enforces. For example:

 

Scale out – Cloud Foundry can do an impressive job scaling out the stateless services according to the load on the application. But sometimes, once the compute bottleneck has been resolved, the next bottleneck becomes the database. If the platform could control stateful services as well, it could scale the database resources too. Today, developers have to pick up the phone and call DevOps to ask for specific service tuning.

 

Multi-Site HA – Production systems often run on more than a single site due to HA requirements and/or regulations. In order to support such deployment topologies, someone has to make the stateful services available on multiple sites when required. As the CF runtime ignores stateful services, there has to be an out-of-platform process of orchestrating stateful services. From the CF perspective, on every site you run a different application, and the coupling of those sites is not visible to CF. Seeing such deployment topologies in real life, I can testify that it is painful and error prone.

 

Portability – Great! CF is portable! Is it? Let’s say I run Facebook on my datacenter using CF. It would be effortless to point my CF client at AWS and push Facebook to a CF instance running there. Job done. Or is it? If you are following so far, you might have noticed that in my new Facebook production site on AWS, I’m going to be very lonely. In fact, I’m not going to have any friends! While the stateless services are portable, the stateful ones, which more than anything are the core value of any application, are certainly not portable.

 

Test & Dev – When new architectures emerge, they often focus on solving the primary use case of running in production. Only later, as the technology matures, do the toolsets and patterns for application maintenance arrive. To be fair, Cloud Foundry does have nice patterns for the application lifecycle, like blue-green deployments. However, this is not enough, especially in the enterprise. When maintaining an application, developers often need access to instances with fresh data from production. In the past, they had to speak with a DBA to extract and scrub the data for them, or even had a self-service way of doing this. In the modern world of micro-services, applications tend to use multiple smaller databases rather than one monolith. In addition, the new DevOps silo means that there is one additional hop in the chain for doing this already complex operation. I’ll elaborate on the developer use case in the next post.

 

Data protection – While I have heard more than once that data protection is not required in cloud native, where the data services replicate data across nodes, I’ve also witnessed organizations lose data due to human error (expiring full S3 buckets without backup) or malicious attacks (e.g. dropping an entire schema). When a crisis happens, organizations need the ability to respond quickly and restore their data. The CF platform hides the data services from us and creates yet another silo in the already disconnected world of data protection: storage admins, backup admins, DBAs, and now – DevOps.

 

Cost Visibility – When adopting modern platforms, more and more processes become automatic. This empowers developers to do more. However, with great power (should) come great responsibility. Companies complain that in large systems they lose visibility into the resource cost per application and use case. While with CF you can limit an application to consume some amount of compute resources, you get no control or visibility over the cost of stateful services (which in many cases is the most significant). For example, developers can run test-dev workloads attached to expensive stateful services when they could have used cheaper infrastructure for their test workloads. With Big Data analytics services there is also a lack of visibility in production: there is no ability to distinguish which app is utilizing what percentage of the shared resource, and therefore no ability to prioritize and optimize.


Native Application Data Distribution example

 

As the technology matures and more organizations take it to production, the community will start catching up with solutions. Some initiatives are already being thrown into the air (e.g. BOSH releases for stateful services) and offer some enhanced features (although always with trade-offs). I’ve recently seen a demo by Azure architects that runs Pivotal Cloud Foundry in Azure. Since they have full control of the stateful services, they could preconfigure the Cloud Foundry marketplace to work seamlessly. Even more impressive, they stated they are working on supporting this configuration on premises using Azure virtualization software on customer hardware. With the tradeoff of being locked in with Microsoft, having the ability to control and get visibility into the full spectrum of the application is certainly an attractive offering.

 

Conclusion

Cloud Foundry is bringing the Cloud Native environment to the enterprise ecosystem. By creating a more flexible model that depends on DevOps, it solves some of the challenges of bringing the Heroku model to the enterprise. Of course, there are new complexities that arise with the Cloud Foundry model. In particular, adding DevOps makes the stateful (i.e. persistent data) operations more challenging – especially for developers.

 

In the next article we’ll discuss the role of developers in the Cloud Native Enterprise. While cloud native applications empower developers to do more, in the enterprise world there are implications that create an interesting dissonance.

 

Amit Lieberman @shpandrak, Assaf Natanzon @ANatanzon, Udi Shemer @UdiShemer

Unintellectual Thoughts


Emptying the dresser drawer of my mind.

  • When will all-flash protection storage become the “hot new thing”? To deal with the increased scale of primary storage capacity and more demanding customer SLAs, the industry is moving from traditional tar/dump backups to versioned replication. Thus, protection storage needs to support higher performance data ingest and instant access recovery. It seems plausible that protection storage will follow the primary storage path: from disk-only to caching/tiering with flash to all-flash (with a cloud-tier for long-term retention).
  • When will custom hardware come back? The industry has pivoted so hard to commodity components, it feels like the pendulum has to swing back. Will hyper-converged infrastructure drive that shift? After all, where better to go custom than inside a completely integrated end-to-end environment (as with the mainframe)?
  • Are job candidates the biggest winners in Open Source? Companies continue to struggle to make money in Open Source. Whether the monetization strategy is services, consulting, management & orchestration, or upsell, it’s been a tough road for Open Source companies. On the other hand, Open Source contributions are like an artist’s portfolio for an engineer – far more useful than a resume. Even better, if you can become influential in Open Source, you can raise your profile with prospective employers.
  • When will NAS evolve (or will it)? It’s been decades since NAS first made it easy for users to consolidate their files and collaborate with their peers in the same site. Since then, the world has evolved from being site-centric (LAN) to global-centric (WAN). Despite all the attempts – Wide-Area File Services (WAFS), WAN accelerators, sync and share – files still seem constrained by the LAN. Will NAS stay grounded or expand beyond the LAN? Or will object storage simply be the evolution for unstructured data storage and collaboration?
  • What’s the future of IT management? Analytics. We’ve spent decades building element managers, aggregated managers, reporting tools, ticketing systems, processes, and layers of support organizations to diagnose and analyze problems. As infrastructure commoditizes, we should be able to standardize telemetry. From that telemetry, we can advise customers on what to do before anything goes wrong. If companies like EMC can make technology that reliably stores exabytes of data around the world, we should be able to make technology that keeps customers from having to babysit those systems.
  • Will Non-Volatile Memory be the disruption that we thought Flash would be? Flash didn’t disrupt the storage industry; it was a media shift that the major platforms/vendors have navigated. (Flash did disrupt the disk drive industry.) The non-volatile memory technologies, however, could be more disruptive. The latency is so small that the overhead of communicating with a storage array exceeds that of the media. In other words, it will take longer to talk to the storage array than it will to extract data from the media. To optimize performance, applications may learn to write to local Non-Volatile Memory, shifting storage out of the primary I/O path. Maybe that will be the disruption we’ve all been talking about?
  • What happens when storage and storage services commoditize? The general consensus is that the commoditization of IT infrastructure is well under way. Most people feel the same about storage and storage services (e.g. replication, data protection, etc.) As commoditization happens, customers will choose products based on cost of purchase and management. As an infrastructure vendor, the question will be – how do we add value? One camp believes that the value will move to management and orchestration. I’m skeptical. Commoditization will lead to storage and services being embedded (e.g. converged/hyper-converged) and implicitly managed. Thus, I think there will be two paths to success. One path involves becoming a premier converged/hyperconverged player. The second revolves around helping customers understand and manage their data – on and off-premises. This means security, forensics, compliance, and helping users find what they need when they need it. Successful vendors will either deliver the end-to-end infrastructure or insight into the data. If you do both… then you’ve really got something. You can guess where I’d like Dell EMC to go.

I also wonder about whether software engineering jobs are following the path of manufacturing jobs, whether software-defined XYZ is a bunch of hooey, the future of networking, whether any of these big-data unicorns has a shot at success, and why people are so hysterical about containers. But we’ll save those incoherent thoughts for another time.

-Stephen Manley @makitadremel

Are You Sabotaging Your Career – The Meeting Mailbag


Career questions. Everybody has them.

Hopefully you can discuss them with a mentor, a trusted manager, or a trustworthy group of friends. If you have no better options, however, you can always ask a long-winded, self-absorbed blogger. To establish my blowhard blogger credentials, I’ve shared my opinions about the importance of perception, committing the three most common mistakes,  successfully getting noticed, escaping the pigeonhole, talking to management, working with executives, and handling career honors.

The result? We have questions from our readers. This time we’ll cover the meeting questions.

Q: I’m terrible at meetings. Nobody listens to me even though I’m right most of the time. How can I get better at meetings? Product Manager, Hopkinton, MA

A: First, let’s be sure you really are terrible at meetings (though that “I’m right most of the time” gives me confidence that you are).

  • When you begin to speak, do people use that as a trigger to go on an unofficial bathroom break?
  • Do people never respond to what you say and act as if everybody had simultaneously gone comatose while you talked?
  • Do you hear groans or sighs when you begin to talk?
  • Do you see everybody suddenly pick up their phones while you’re speaking? Do you have a paranoid sensation that they’re mocking you via text? (Note: It’s probably not paranoia.)
  • Do you find a lot of decisions are made in “hallway conversations” and the number of meetings is reduced?

If you answer yes to most of these questions: congratulations, you are really, really bad at meetings! (None of these are exaggerations. They happen in most big meetings I attend – to people at all levels and in all job roles.)

Now that we know you’re the “Batman vs. Superman” of meetings (lots of noise, no impact), how do you improve your performance at meetings?

First, understand what type of meeting it is, so you can act appropriately.

  • Decision Communication – Everything has been decided. Focus your feedback on improving the implementation of the decision. Everything else is disruptive. Disagree and commit.
  • Decision Discussion – There has been a preliminary proposal, and the presenters are seeking new information or perspectives. Your goal: convince the decision makers to meet with you after the meeting. Focus on concise and positive input that brings something new to the discussion (shockingly, people tend to avoid those who are relentlessly negative and antagonistic). If you try to go too deep in the meeting, you will be creating a “rat hole”. If you stray off-point, you’ll be pegged as passive-aggressive.
  • Topic Introduction – The presenter is trying to establish a basic, shared level of understanding. Your goal: ensure that you’re involved in the preliminary proposal. Focus your input on either establishing your credentials in the area (briefly and without ego) or making it clear that your team/group is integral to the decision (either implementing or being affected by it). Pushing for a decision will make you look rash and bossy.

Second, regardless of the type of meeting:

  • Brevity – Do you find it annoying when somebody drones on during a meeting? It’s just as annoying when you do it.
  • Relevance – Do you hate when people talk without adding anything new or applicable to the discussion? People hate you when you do it, too.
  • Be Positive – Do you resent people who ridicule your ideas (and, thus, you) in public? When you do it, other people fantasize about you getting stuck in never-ending bug reviews. [NOTE: You can be positive, even when critiquing an idea, by genuinely wanting to help the presenter succeed, rather than trying to prove your superiority. Intent shows through.]

Finally, there are a few habits to break immediately:

  • Body Language – Do you bang your head on the table, slump in your chair, or roll your eyes at the presenter?
  • Tone of Voice – Do you sound critical or condescending? Do you sigh deeply before you talk, as if everybody else’s stupidity wearies you?
  • Talking over others – We all know that you’ve just had the most brilliant thought since Vin Diesel thought of adding The Rock to The Fast and Furious. It’s clearly more important than whatever others are saying. So why bother listening, or even letting them finish?

If so, you’re in good company – I’ve done all of those things. Now stop being disrespectful, annoying, unprofessional, and immature. Nobody will take you seriously when you act like a spoiled child.

One of my worst meetings – we were prioritizing management features for a product I’d led. The management team was just trying to understand the product (Topic Introduction). Unfortunately, I expected them to be experts and that the meeting was to ratify decisions (because I didn’t ask). After spending the meeting interrupting them, rolling my eyes, and insulting their lack of knowledge of the product, I concluded by spectacularly banging my head on the table 3 times. Word of the meeting spread quickly. One of my mentors asked if I’d taken up day drinking. The management team asked for a different liaison. For months people behaved differently around me – less open – because they feared I’d melt down again. And I gave myself a really bad headache.

You can learn to be good at meetings. It all starts with common sense, respect for others, and taking an extra moment to think about how others will perceive your actions and words. In other words – it starts with trying.

Q: I’m in Quality Engineering (e.g. Test, Automation). My manager got me invited to design reviews, but I don’t know what to say. The developers say that I’m not adding value, and they don’t want to invite me anymore. How can I make an impact? Engineer, RTP, NC

A: This is the flip side to the meeting question above. Instead of being overbearing and obnoxious, you’re unsure and becoming a meeting wallflower.

Most of the previous advice still applies. A few other things to add:

  • Prepare: Design review meetings are new to you. You’ll need to work harder to make your mark. Prepare a couple of points of feedback prior to the meeting (even if you have to talk with some developers separately before the meeting). That way, you’re confident that you’ll have an impact. It will relax you, so you can add even more value in the flow of the meeting.
  • Focus on your unique value: As the QE representative, the developers are expecting you to bring a different perspective – e.g. customer advocacy, automation requirements, testability, diagnosability, or cross-group impact. Therefore, focus on that; don’t try to be a developer.
  • Advocate for your group: You’re not just in the meeting for yourself; you’re representing your team. Therefore, find out what your group needs from the developers. That will help you prepare and add unique value.

When I first started attending business planning meetings, I struggled to add value as well. Initially, I sat quietly. Unfortunately, I worried that I wasn’t adding value. Then I tried to contribute by offering brilliant advice about budgets, margins, and sales models. The “are you kidding me?” looks on everybody’s faces convinced me that I definitely wasn’t adding value. Finally, I found my sweet spot – helping balance present and future investment in product development and marketing. And in making sure that I tell more bad jokes than anybody else.

 

Conclusion

Meetings matter. Not just for getting work done, but in setting your public image. Good luck and keep the questions coming!

Stephen Manley @makitadremel

Keeping it Core


EMC Core Technologies Division includes EMC’s traditional File and Block Storage Technologies along with EMC Backup and Recovery Solutions. In this podcast, we sit down with Scott Delandy (@ScottDelandy), Technology Director in the CTO group, at the EMC Forum in New York City and chat about his session, “VNX and VMAX: A Proven Hybrid Cloud Storage Foundation.” We also discuss the how, what, and why across a range of topics including storage, religion, and solving problems.