dellites.me

Hollis Tibbetts (Social Media Today) – Socially Stephanie: Tips for a Successful Crowdfunding Campaign

"Dear Socially Stephanie, I have a great idea, but the problem is I don't have enough money to start my business. I was thinking about doing a crowdfunding campaign. Do you have any tips to help me make my campaign a success?"

Hollis Tibbetts (Social Media Today) – What Brands Can Learn from the Best and Worst World Cup Tweets

The World Cup was the most tweeted about event in Twitter's history, with hundreds of big businesses attempting to use the tournament to help market their brand. But what can smaller businesses learn from the best, and the worst, of those tweets?

Hollis Tibbetts (Social Media Today) – Should Brands Abandon Facebook?

Should Brands Abandon Facebook? My answer to this rhetorical question is yes - unless you are willing to rethink your Facebook strategy and spend money to sponsor your company page posts. While you do not have to spend a large sum of money, my experience has shown that Facebook organic reach is rapidly diminishing and it is almost not worth the effort to continue with the "free lunch" approach.

Hollis Tibbetts (Social Media Today) – 5 Reasons Why Animated Explainer Videos Boost Your Social Media Campaigns

What’s the deal with explainer videos? Are they useful for the current social media generation? Could they actually be your greatest social media marketing tool? Around seven years ago, blooming social networks were the “it” thing among online business experts. Almost every company was wondering, and evaluating, how far it could go toward its goals with social media marketing.

Hollis Tibbetts (Social Media Today) – Content Discovery Smackdown: Hootsuite vs. Buffer vs. Klout

Content discovery has become so important that tools once used for other purposes have now integrated this feature into their offerings. Three of these tools that we’re going to explore today include Hootsuite, Buffer and Klout.

Hollis Tibbetts (Social Media Today) – Are Social Media Teams Shrinking or Changing?

Have you noticed a change in the role of the dedicated social media team in today’s digital marketing landscape? What is happening to our social media teams? Are they shrinking, or are they just changing?

Hollis Tibbetts (Social Media Today) – Social Network vs. Online Community: What Is the Difference?

Underneath the large, all-encompassing term of social media, there are two types or sub-categories: Social Networks and Online Communities. Both have their perks and common uses, but what really are they and who should be using them? Here’s the breakdown.

Hollis Tibbetts (Social Media Today) – Why SEO Basics Are Not Enough

The truth is that good SEO is often not a basic process. With Google’s ever-changing algorithms, there are new considerations all the time, so the basics might not bring you the results you seek.

Hollis Tibbetts (Social Media Today) – The Importance of Comments

If you have a business blog that people read and comment on, you have a real-time focus group. If you answer the comments quickly, a conversation happens. People who read that conversation feel attached to your business because they got interested in your story, and that story is being told in the dialog of the comment section.

Hollis Tibbetts (Social Media Today) – 7 Website Tips to Attract More Shoppers to Your Pages

You’ve probably invested a lot of blood, sweat, and tears into the creation of your store’s website, but if no one knows about your site, how can they possibly buy from you? Here are seven website tips for attracting shoppers to your pages.

Hollis Tibbetts (Social Media Today) – The Importance of Mobile Search in Retail

Studies have found that 30 percent of smartphone searchers and 25 percent of tablet users buy what they’re looking for within one hour. By making smart use of mobile features like the ability to add a one-touch phone number so that potential customers can call you, retailers can encourage immediate conversion.

Hollis Tibbetts (Social Media Today) – The Power of First Impressions and Branding

To be effective, the image we are branding our business with needs to reflect our business positively and accurately. It needs to begin with the first impression and carry on from there. There’s no point in selling our business as one that cares if the products or services we deliver, and the way we deliver them, leave our customers disappointed.

Hollis Tibbetts (Social Media Today) – Three Things "Guardians of the Galaxy" Can Teach You About Content

When Guardians of the Galaxy opens next Friday, Marvel Studios will score a huge opening. This is the latest in a string of successes for Marvel, and it is due to the brand’s adherence to some core tenets. You can apply many ideals of the Marvel method to your own work, and create content that succeeds in a crowded field.

Hollis Tibbetts (Social Media Today) – Writing Better Content Faster

Don’t we all want to write awesome content faster and more consistently? Yes. I’m not referring to business-style blog posts, either, but anything you throw out to social media, your press releases, website content, and other material you publish on a regular basis. Time is, after all, the one thing separating us from publishing share-worthy material to billions of readers (see: exaggeration).

Hollis Tibbetts (Social Media Today) – 5 Critical Considerations when Using LinkedIn Sponsored Updates

The LinkedIn ecosystem is a very tempting target for B2B marketers. After all, you can target in ways you simply can't on other advertising platforms. Someone's title, skills, location and company are all up for grabs. It should be like shooting fish in a barrel, but alas, like all magic bullets, this one still requires pinpoint marksmanship.

Hollis Tibbetts (Social Media Today) – How to Dedicate the Time Required to Achieve Social Media Success

Why are we all so eager to spend as little time as possible on our business's social media marketing efforts? There tends to be a strong correlation between the time, energy and effort put toward implementing a smartly crafted social media strategy, and expected results.

Hollis Tibbetts (Social Media Today) – 17 Must-Dos to Get Your Content Read [INFOGRAPHIC]

A 17-step checklist of key tactics that will optimize and target your blog content to get in front of the people who matter, and generate leads.

Hollis Tibbetts (Social Media Today) – Is This Digital Marketing or Delusional Marketing?

Marketing teams around the world are pushing the digital agenda. The better among them are scratching their heads about the ROI, how to create an effective integrated campaign and where to find the best talent. The answers are far from easy. Meanwhile, there are many that continue to bark up the wrong tree, investing in delusional marketing rather than effective digital marketing.

Hollis Tibbetts (Social Media Today) – Choosing to Be Forgotten with Ephemeral Messaging

In a world where all of our online actions can be recorded, the ability to have a regular conversation digitally is a refreshing change. If you go out for coffee with a friend, you don’t need to worry that everything you say is being recorded for all time. Ephemeral messaging tries to recreate that natural conversation feeling.

Hollis Tibbetts (Social Media Today) – How Omni-Channel Marketing Shapes the New Buyer's Journey

With the rapid growth of digital consumption and what seems like daily proliferation of social media channels, marketers are faced with more choices than ever when considering how they want to reach the consumer. From their cell phone to the desktop to an in store visit, we are entering an omni-channel world, where consumers seek an omni-channel experience.

Hollis Tibbetts (Social Media Today) – Atri Chatterjee of Act-On Software on the New Generation of Marketers

According to a recent SiriusDecisions survey, only 16% of B2B companies use marketing automation, and many companies using some form of automation focus their efforts on less advanced areas, like email marketing and landing pages. Atri Chatterjee, CMO for marketing automation platform provider Act-On, discusses the changing landscape of marketing and what's driving the accelerated adoption of marketing automation tools, and who is leading the charge for a more analytical approach to customer engagement taking place today.

Hollis Tibbetts (Social Media Today) – 8 Great Tools for Online Community Managers

Unless you live under a rock and/or rarely utilize God’s greatest gift to this earth, otherwise known as the Internet, you know by now that content marketing is KEY. Whether it be for your personal brand, small business or corporation, content must be upbeat, helpful, relevant, entertaining, interesting and more, all at once.

Hollis Tibbetts (Social Media Today) – How to Use Pinterest for Business

The great thing about Pinterest is you can use it to direct traffic to your website, a product, or a blog post. With each pin you post or re-pin you can include a link. This link will direct users to your site as soon as they maximize the photo and click it for more information. Include a call to action in the description of your photo that will entice your target to continue on a click-through.

Hollis Tibbetts (Social Media Today) – Why Facebook Advertising is Winning

Facebook has made a raft of changes to its advertising over the past two years that have turned it into a fantastic platform. Now, the results are also standing up to scrutiny, and in many cases Facebook advertising is outperforming Google AdWords in terms of conversions. Find out how to achieve this and why it is occurring in this post.

Jason Boche – Legacy vSphere Client Plug-in 1.7 Released for Storage Center

Dell Compellent Storage Center customers who use the legacy vSphere Client plug-in to manage their storage may have noticed that the upgrade to PowerCLI 5.5 R2, which was released with vSphere 5.5 Update 1, essentially “broke” the plug-in. This forced customers to choose between staying on PowerCLI 5.5 in order to keep using the legacy vSphere Client plug-in, or reaping the benefits of the PowerCLI 5.5 R2 upgrade and abandoning the legacy plug-in.

For those who are unaware, there is a third option: leverage vSphere’s next-generation web client along with the web client plug-in Dell Compellent released last year (I talked about it at VMworld 2013; you can take a quick look below).

Although VMware strongly encourages customers to migrate to the next generation web client long term, I’m here to tell you that in the interim Dell has revved the legacy client plug-in to version 1.7, which is now compatible with PowerCLI 5.5 R2. Both the legacy and web client plug-ins are free and quite beneficial from an operations standpoint, so I encourage customers to get familiar with the tools and use them.
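
If you are not sure which PowerCLI build is installed on the workstation running the legacy vSphere Client, a quick check from a PowerShell session will tell you whether the 1.7 plug-in is the version you need. This is only a minimal sketch and assumes PowerCLI 5.x is installed as a snap-in:

# Load the PowerCLI snap-in if it is not already loaded (PowerCLI 5.x ships as a snap-in)
if (-not (Get-PSSnapin -Name VMware.VimAutomation.Core -ErrorAction SilentlyContinue)) {
    Add-PSSnapin VMware.VimAutomation.Core
}
# Report the installed PowerCLI version and build; builds at the 5.5 R2 level and later
# are the ones that require the updated 1.7 legacy plug-in
Get-PowerCLIVersion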

Other bug fixes in this 1.7 release include:

  • Datastore name validation not handled properly
  • Create Datastore, map existing volume – Server Mapping will be removed from SC whether or not it was created by VSP
  • Add Raw Device wizard does not allow a host to be unchecked once selected
  • Remove Raw Device wizard shows wrong volume size
  • Update to use new code signing certificate
  • Prevent Datastores & RDMs with underlying Live Volumes from being expanded or deleted
  • Add support for additional Flash Optimized Storage Profiles that were added in SC 6.4.2
  • Block size not offered when creating VMFS-3 Datastore from Datacenter menu item
  • Add Raw Device wizard does not allow a host within the same cluster as the selected host to be unchecked once it has been selected
  • Add RDM wizard – properties screen showing wrong or missing values
  • Expire Replay wizard – no error reported if no replays selected
  • Storage Consumption stats are wrong if a Disk folder has more than one Storage Type


Hollis Tibbetts (Ulitzer) – Should Cloud Be Part of Your Backup and Disaster Recovery Plan?

The introduction of the cloud has enabled a fast and agile data recovery process that is considerably more efficient than restoring data from physical drives, as was the former practice. How does this impact Backup & Recovery, Disaster Recovery and Business Continuity initiatives? Cloud backup is a new approach to data storage and backup that allows users to store a copy of their data on an offsite server, accessible via the network. The network that hosts the server may be private or public, and is often managed by a third-party service provider. The provision of cloud solutions for data recovery services is therefore a flourishing market, in which the service provider charges users for server access, storage space, bandwidth, and so on.


Mark Cathcart – Operational Transparency

I’ve had a couple of emails and Facebook comments that asked, or at least implied: was this a real email? Yes, it was. I do get questions like these from time to time; it’s just unusual to get them all in a single email. The final question in the email from my colleague was:

 How important do you think operational transparency is?

My response was again curt and to the point. I think, without context for the question, this was the best I could do.

Very. There are times when it is OK to be opaque; there is never a time to be deceptive. Your manager should never tell you he/she will work on a promotion for you when they know they have no ability to deliver it. They should never tell you your position is solid when they know it isn’t.

Austin Business Journal posted an article with a quote from a University of Texas (UT) expert on recent staff actions at Dell. I wrote a response/comment which I think nicely bookends this series of posts. Thanks for all the positive feedback.


Mark Cathcart – Dealing with difficult people

Question 3 in the email, and my answer, is really why I ended up writing this short series of blog posts. Having read back what I’d written, I realized that after a couple of good answers, I’d been pretty superficial with my 3rd. The question posed was:

How do you go about dealing with difficult people and company politics?

My response was:

See my answers to 1 and 2 above. Got a family? It’s no different. If you shout at kids, yours, nieces, nephews, how productive is that really? Sometimes you can bully people into changing; it is almost always better to show them a better way.

This is indeed over-simplistic without context. Of course, it’s what you should do. The more you get embroiled in office politics, the more it is likely to distract you from your real value: being great at what you do. If being great at what you do is being difficult and playing company politics, well, good luck with that. We all know people who have to some degree “made it” because they’ve been good at using the system, but for every one of those there are five who made it because they are good at what they do.

Failing organizations and companies are rife with people trying to control the system to their advantage, trying to cheat or deceive on their contributions, but my experience has always been that a rising tide lifts all boats.

Again, Nigel covers in the 3 Minute Mentor a good case where company politics come into play, where teams and departments are pitted against each other, either deliberately or inadvertently. It’s worth watching, or reading his show notes.

Still, I fall back on: be good, have fun, do what you love, and leave the politics to others.


Mark Cathcart – Career goals and aspirations

Following on from yesterday’s post, the 2nd question that came up in the email was:

What is the most effective way to build and achieve career goals?

Before I get to my answer, I’d like to put in a plug for the 3 Minute Mentor website created, run and produced by long-time friend, ex-colleague and fellow Brit ex-pat Nigel Dessau. Nigel and I worked together as far back as 1991, and he has produced a fine set of short, topic-based video advice guides. I don’t agree with all of them, but they are a fantastic resource.

The very first 3 Minute Mentor episode was, in fact, “How should I plan a career?” My take on careers has long been documented in my now 15-year-old “Ways to measure progress” presentation, available in its original format on slideshare.net or in the 2012 format, Technical and Professional careers, which I delivered at Texas A&M.

My approach has always been to set a long term goal, and then judge changes and opportunities against that goal. My email answer makes sense in that context.

This is a long-term objective. As per the presentation (see above), you need to evaluate each and every job and assignment against a long-term objective; depending on what you are aiming for long term, you may or may not decide to take a short-term job. For example, I took an assignment in New York as a stepping stone to get here to Austin. I’d worked in NY before and had no desire to go back. However, it equally wasn’t clear how I would get assigned to Austin, so I took the NY job and worked on connections here [Austin] to create the opportunity to move to Austin.

Next up, “How do you go about dealing with difficult people and company politics?”


Rob Hirschfeld – OpenStack DefCore Review [interview by Jason Baker]

I was interviewed about DefCore by Jason Baker of Red Hat as part of my participation in OSCON Open Cloud Day (speaking Monday 11:30am).  This is just one of fifteen in a series of speaker interviews covering everything from Docker to Girls in Tech.

This interview serves as a good review of DefCore so I’m reposting it here:

Without giving away too much, what are you discussing at OSCON? What drove the need for DefCore?

I’m going to walk through the impact of the OpenStack DefCore process in real terms for users and operators. I’ll talk about how the process works and how we hope it will make OpenStack users’ lives better. Our goal is to take steps towards interoperability between clouds.

DefCore grew out of a need to answer hard and high stakes questions around OpenStack. Questions like “is Swift required?” and “which parts of OpenStack do I have to ship?” have very serious implications for the OpenStack ecosystem.

It was impossible to reach consensus about these questions in regular board meetings so DefCore stepped back to base principles. We’ve been building up a process that helps us make decisions in a transparent way. That’s very important in an open source community because contributors and users want ground rules for engagement.

It seems like there has been a lot of discussion over the OpenStack listservs over what DefCore is and what it isn’t. What’s your definition?

First, DefCore applies only to commercial uses of the OpenStack name. There are different rules for the integrated code base and community activity. That’s the place of most confusion.

Basically, DefCore establishes the required minimum feature set for OpenStack products.

The longer version includes that it’s a board managed process that’s designed to be very transparent and objective. The long-term objective is to ensure that OpenStack clouds are interoperable in a measurable way and that we also encourage our vendor ecosystem to keep participating in upstream development and creation of tests.

A final important component of DefCore is that we are defending the OpenStack brand. While we want a vibrant ecosystem of vendors, we must first have a community that knows what OpenStack is and trusts that companies using our brand comply with a meaningful baseline.

Are there other open source projects out there using “designated sections” of code to define their product, or is this concept unique to OpenStack? What lessons do you think can be learned from other projects’ control (or lack thereof) of what must be included to retain the use of the project’s name?

I’m not aware of other projects using those exact words. We picked up ‘designated sections’ because the community felt that ‘plug-ins’ and ‘modules’ were too limited and generic. I think the term can be confusing, but it was the best we found.

If you consider designated sections to be plug-ins or modules, then there are other projects with similar concepts. Many successful open source projects (Eclipse, Linux, Samba) are functionally frameworks that have very robust extensibility. These projects encourage people to use their code base creatively and then give back some (not all) of their lessons learned in the form of code contributions. If the scope of returning value to upstream is too broad, then sharing back can become onerous and forking ensues.

All projects must work to find the right balance between collaborative areas (which have community overhead to join) and independent modules (which allow small teams to move quickly). From that perspective, I think the concept is very aligned with good engineering design principles.

The key goal is to help the technical and vendor communities know where it’s safe to offer alternatives and where they are expected to work in the upstream. In my opinion, designated sections foster innovation because they allow people to try new ideas and to target specialized use cases without having to fight about which parts get upstreamed.

What is it like to serve as a community elected OpenStack board member? Are there interests you hope to serve that are different from the corporate board spots, or is that distinction even noticeable in practice?

It’s been like trying to row a dragon boat down class III rapids. There are a lot of people with oars in the water but we’re neither all rowing together nor able to fight the current. I do think the community members represent different interests than the sponsored seats but I also think the TC/board seats are different too. Each board member brings a distinct perspective based on their experience and interests. While those perspectives are shaped by their employment, I’m very happy to say that I do not see their corporate affiliation as a factor in their actions or decisions. I can think of specific cases where I’ve seen the opposite: board members have acted outside of their affiliation.

When you look back at how OpenStack has grown and developed over the past four years, what has been your biggest surprise?

Honestly, I’m surprised about how many wheels we’ve had to re-invent. I don’t know if it’s cultural or truly a need created by the size and scope of the project, but it seems like we’ve had to (re)create things that we could have leveraged.

What are you most excited about for the “K” release of OpenStack?

The addition of platform services like Database as a Service, DNS as a Service, and Firewall as a Service. I think these IaaS “adjacent” services are essential to completing the cloud infrastructure story.

Any final thoughts?

In DefCore, we’ve moved slowly and deliberately to ensure people have a chance to participate. We’ve also pushed some problems into the future so that we could resolve the central issues first. We need the community to speak up (either for or against) in order for us to accelerate: silence means we must pause for more input.


Rob Hirschfeld – Boot me up! out-of-band IPMI rocks then shuts up and waits

It’s hard to get excited about re-implementing functionality from v1 unless the v2 happens to also be freaking awesome.   It’s awesome because the OpenCrowbar architecture allows us to do it “the right way” with real out-of-band controls against the open WSMAN APIs.

With out-of-band control, we can easily turn systems on and off using OpenCrowbar orchestration.  This means that it’s now standard practice to power off nodes after discovery & inventory until they are ready for OS installation.  This is especially interesting because many servers’ RAID and BIOS can be configured out-of-band without powering on at all.

Frankly, Crowbar 1 (cutting edge in 2011) was a bit hacky.  All of the WSMAN control was done in-band but looped through a gateway on the admin server so we could access the out-of-band API.  We also used the vendor (Dell) tools instead of open API sets.

That means that OpenCrowbar hardware configuration is truly multi-vendor.  I’ve got Dell & SuperMicro servers booting and out-of-band managed.  Want more vendors?  I’ll give you my shipping address.

OpenCrowbar does this out of the box and in the open so that everyone can participate.  That’s how we solve this problem as an industry and start to cope with hardware snowflaking.

And this out-of-band management gets even more interesting…

Since we’re talking to servers out-of-band (without the server being “on”) we can configure systems before they are even booted for provisioning.  Since OpenCrowbar does not require a discovery boot, you could pre-populate all your configurations via the API and have the Disk and BIOS settings ready before they are even booted (for models like the Dell iDRAC where the BMCs start immediately on power connect).

Those are my favorite features, but there’s more to love:

  • the new design does not require a network gateway between the admin and BMC networks (v1 did), which was a security issue
  • the configuration will detect and preserve existing assigned IPs.  This is a big deal in lab configurations where you are reusing the same machines and have scripted remote consoles.
  • OpenCrowbar offers an API to turn machines on/off using the out-of-band BMC network.
  • The system detects whether nodes have IPMI (VMs & containers do not) and skips configuration BUT still manages to have power control using SSH (and could use VM APIs in the future)
  • Of course, we automatically set up the BMC network based on your desired configuration

 


Mark Cathcart – How to stay relevant

I received an email from a colleague in one of the acquired companies; he asked, among other things:

What is the most effective way to influence or implement positive change at large companies?

Rather than dump my entire email reply here, I thought I’d break it up into a few shorter posts.

Easy to say, not so easy to do. You have to demonstrate a sustained track record of delivering on important projects. You have to make yourself relevant. How do you stay relevant? Start with tracking what is important to your boss, then meet deadlines; volunteer for hard projects; mentor; measure and report results; always be positive, the glass is always half full; work hard; volunteer more. Make yourself indispensable. When you think you’ve done that for your boss, move on: track what his/her boss thinks is important, then lather, rinse, repeat.

Recently someone told me they couldn’t make progress because corporate “branding” was telling him that he had to deliver what was important to them. I asked who “they” was; he was evasive. This was useful as it showed he’d been beaten down by the system. There is no such person as Corporate Branding; it’s a team of people, managers and executives. They have a job and they have objectives. Getting beaten down by them just shows that he hadn’t thought it through and taken his case to the right people. Everything, yes, everything is fixable in a large company; you just have to decide it’s worth fixing, knowing that you can only do this in a positive, forward-looking way. Anything else requires people to admit they were wrong, and who does that?

Some things are not worth fixing.


Rob Hirschfeld – a Ready State analogy: “roughed in” brings it Home for non-ops-nerds

I’ve been seeing great acceptance of the concept of ops Ready State.  Technologists from both ops and dev immediately understand the need to “draw a line in the sand” between system prep and installation.  We also admit that getting physical infrastructure to Ready State is largely taken for granted; however, it often takes multiple attempts to get it right, and even small application changes can require a full system rebuild.

Since even small changes can redefine the ready state requirements, changing Ready State can feel like being told to tear down your house so you can remodel the kitchen.

A friend asked me to explain “Ready State” in non-technical terms.  So far, the best analogy that I’ve found is when a house is “Roughed In.”  It’s helpful if you’ve ever been part of house construction, but it may not be universally accessible, so I’ll explain.

Getting to Rough In means that all of the basic infrastructure of the house is in place but nothing is finished.  The foundation is poured, the plumbing lines are placed, the electrical mains are ready, the roof is on and the walls are up.  The house is being built according to architectural plans, and the major decisions have been made, like how many rooms there are and the function of each room (bathroom, kitchen, great room, etc.).  For Ready State, that’s like having the servers racked and set up with Disk, BIOS, and network configured.

While we’ve built a lot, rough in is a relatively early milestone in construction.  Even major items like the type of roof, siding and windows can still be changed.  Speaking of windows, this is like installing an operating system in Ready State.  We want to consider this as a distinct milestone because there’s still room to make changes.  Once the roof and exteriors are added, it becomes much more disruptive and expensive to make them.

Once the house is roughed in, the finishing work begins.  Almost nothing from roughed in will be visible to the people living in the house.  Like a Ready State setup, the users interact with what gets laid on top of the infrastructure.  For homes it’s the walls, counters, fixtures and so on.  For operators, it’s applications like Hadoop, OpenStack or CloudFoundry.

Taking this analogy back to where we started, what if we could make rebuilding an entire house take just a day?!  In construction, that’s simply not practical; however, we’re getting to a place in Ops where automation makes it possible to reconstruct the infrastructure configuration much faster.

While we can’t re-pour the foundation (aka swap out physical gear) instantly, we should be able to build up from there to ready state in a much more repeatable way.


Ravikanth Chaganti – WPC 2014 – One-click deployment of SharePoint Farm on Azure

At WPC 2014, Scott Gu announced several new capabilities in Azure and one such new capability is the templates available for ready deployment. Scott demonstrated creation of a SharePoint 2013 farm that can be highly available and demonstrated that we can customize the SQL and other settings. This is a great feature and I couldn’t…

Rob Hirschfeld – OpenStack DefCore Update & 7/16 Community Reviews

The OpenStack Board effort to define “what is core” for commercial use (aka DefCore) continues.  I have blogged extensively about this topic and rely on you to review that material, because this post focuses on updates from recent activity.

First, Please Join Our Community DefCore Reviews on 7/16!

We’re reviewing the current DefCore process & timeline then talking about the Advisory Havana Capabilities Matrix (decoder).

To support global access, there are TWO meetings (both will also be recorded):

  1. July 16, 8 am PDT / 1500 UTC
  2. July 16, 6 pm PDT / 0100 UTC July 17

Note: I’m presenting about DefCore at OSCON on 7/21 at 11:30!

We want community input!  The Board is going to discuss and, hopefully, approve the matrix at our next meeting on 7/22.  After that, the Board will be focused on defining Designated Sections for Havana and Ice House (the TC is not owning that as previously expected).

The DefCore process is gaining momentum.  We’ve reached the point where there are tangible (yet still non-binding) results to review.  The Refstack effort to collect community test results from running clouds is underway: the Core Matrix will be fed into Refstack to validate against the DefCore required capabilities.

Now is the time to make adjustments and corrections!  

In the next few months, we’re going to be locking in more and more of the process as we get ready to make it part of the OpenStack by-laws (see bottom of minutes).

If you cannot make these meetings, we still want to hear from you!  The most direct way to engage is via the DefCore mailing list, but 1×1 email works too!  Your input is important to us!


Jason Boche – The VMworld US Session Builder Is Now Open

For those not hearing the news on Twitter, notice from VMware was email blasted this morning. I received mine at 9:03am CST.

Of the 455 sessions available, over 14% cover NSX and VSAN which were the two major themes at last year’s show. This is almost equal to the total number of vSphere sessions available this year.

Go go go!


Jason Boche – Yet another blog post about vSphere HA and PDL

If you ended up here searching for information on PDL or APD, your evening or weekend plans may be cancelled at this point and I’m sorry for you if that is the case. There are probably 101 or more online resources which discuss the interrelated vSphere storage topics of All Paths Down (known as APD), Permanent Device Loss (known as PDL), and vSphere High Availability (known as HA, and before dinosaurs roamed the Earth – DAS ). To put it in perspective, I’ve quickly pulled together a short list of resources below using Google. I’ve read most of them:

VMware KB: Permanent Device Loss (PDL) and All-Paths

VMware KB: PDL AutoRemove feature in vSphere 5.5

Handling the All Paths Down (APD) condition – VMware Blogs

vSphere 5.5. Storage Enhancements Part 9 – PDL

Permanent Device Loss (PDL) enhancements in vSphere 5.0

APD (All Paths Down) and PDL (Permanent Device Loss

vSphere Metro Storage Cluster solutions and PDL’s

vSphere Metro Stretched Cluster with vSphere 5.5 and PDL

Change in Permanent Device Loss (PDL) behavior for 5.1

VMware KB: PDL AutoRemove feature in vSphere 5.5

PDL AutoRemove – CormacHogan.com

How handle the APD issue in vSphere – vInfrastructure Blog

Interpreting SCSI sense codes in VMware ESXi and ESX

What’s New in VMware vSphere® 5.1 – Storage

vSphere configuration for handling APD/PDL – CloudXC

vSphere 5.1 Storage Enhancements – Part 4: All Paths Down

vSphere 5.5 nuggets: changes to disk – Yellow Bricks

ESXi host disk.terminateVMOnPDLDefault configuration

ESXi host VMkernel.Boot.terminateVMOnPDL configuration

vSphere HA in my opinion is a great feature. It has saved my backside more than once, both in the office and at home. Several books have been more or less dedicated to the topic, and yet it is so easy to use that an entire cluster and all of its running virtual machines can be protected with default parameters (common garden variety) with just two mouse clicks.

VMware’s roots began with compute virtualization so when HA was originally released in VMware Virtual Infrastructure 3 (one major revision before it became the vSphere platform known today), the bits licensed and borrowed from Legato Automated Availability Manager (AAM) were designed to protect against marginal but historically documented amounts of x86 hardware failure thereby reducing unplanned downtime and loss of virtualization capacity to a minimum. Basically if an ESX host yields to issues relating to CPU, memory, or network, VMs restart somewhere else in the cluster.

It wasn’t really until vSphere 5.0 that VMware began building in high availability for storage aside from legacy design components such as redundant fabrics, host bus adapters (HBAs), multipath I/O (MPIO), failback policies, and with vSphere 4.0 the pluggable storage architecture (PSA) although this is not to say that any of these design items are irrelevant today – quite the opposite.  vSphere 5.0 introduced Permanent Device Loss (PDL) which does a better job of handling unexpected loss of individual storage devices than APD solely did.  Subsequent vSphere 5.x revisions made further PDL improvements such as improving support for single LUN:single target arrays in 5.1. In short, the new vSphere HA re-write (Legato served its purpose and is gone now) covers much of the storage gap such that in the event of certain storage related failures, HA will restart virtual machines, vApps, services, and applications somewhere else – again to minimize unplanned downtime. Fundamentally, this works just like HA when a vSphere host tips over, but instead the storage tips over and HA is called to action. Note that HA can’t do much about an entire unfederated array failing – this is more about individual storage/host connectivity. Aside from gross negligence on the part of administrators, I believe the failure scenarios are more likely to resonate with non-uniform stretched or metro cluster designs. However, PDL can also occur in small intra datacenter designs as well.

I won’t go into much more detail about the story that has unfolded with APD and the new features in vSphere 5.x because it has already been documented many times over in some of the links above.  Let’s just say the folks starting out new with vSphere 5.1 and 5.5 had it better than myself and many others did dealing with APD and hostd going dark. However, the trade off for them is they are going to have to deal with Software Defined * a lot longer than I will.

Although I mentioned earlier that vSphere HA is extremely simple to configure, I did also mention that was with default options, which cover a large majority of host-related failures.  Configuring HA to restart VMs automatically and with no user intervention in the event of a PDL condition is, in theory, just one configuration change for each host in the cluster. Where to configure it depends on the vSphere host version.

vSphere 5.0u1+/5.1: Disk.terminateVMOnPDLDefault = True (/etc/vmware/settings file on each host)

or

vSphere 5.5+: VMkernel.Boot.terminateVMOnPDL = yes (advanced setting on each host, check the box)

One thing about this configuration that had me chasing sense codes in vmkernel logs recently was lack of clarity on the required host reboot. That’s mainly what prompted this article – I normally don’t cover something that has already been covered well by other writers unless there is something I can add, something was missed, or it has caused me personal pain (my blog + SEO = helps ensure I don’t suffer from the same problems twice). In all of the online articles I had read about these configurations, none mentioned a host reboot requirement and it’s not apparent that a host reboot is required until PDL actually happens and automatic VM restart via HA actually does not. The vSphere 5.5 documentation calls it out. Go figure. I’ll admit that sometimes I will refer to a reputable vMcBlog before the product documentation. So let the search engine results show: when configuring  VMkernel.Boot.terminateVMOnPDL a host reboot or restart is required. VMware KB 1038578 also calls out that as of vSphere 5.5 you must reboot the host for VMkernel.boot configuration changes to take effect. I’m not a big fan of HA or any configuration being written into VMkernel.boot requiring host or VSAN node performance/capacity outages when a change is made but that is VMware Engineering’s decision and I’m sure there is a relevant reason for it aside from wanting more operational parity with the Windows operating system.
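
If you would rather push the setting out with PowerCLI than edit each host by hand, something along the lines of the sketch below works against vSphere 5.5 hosts. It is only a sketch: the cluster name is a placeholder, it assumes an existing Connect-VIServer session, and the host reboot requirement discussed above still applies.

# Set VMkernel.Boot.terminateVMOnPDL on every host in a cluster (vSphere 5.5+)
Get-Cluster "Cluster01" | Get-VMHost | ForEach-Object {
    Get-AdvancedSetting -Entity $_ -Name "VMkernel.Boot.terminateVMOnPDL" |
        Set-AdvancedSetting -Value $true -Confirm:$false
    # Remember: each host still needs a reboot before the change takes effect
}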

I’ll also reiterate Duncan Epping’s recommendation that if you’re already licensed for HA and have made the design and operational decision to allow HA to restart VMs in the event of a host failure, then the above configuration should be made on all vSphere clustered hosts, whether they are part of a stretched cluster or not, to protect against storage-related failures. A PDL can boil down to one host losing all available paths to a LUN. By not making the HA configuration change above, a storage-related failure results in user intervention being required to recover all of the virtual machines on the host tied to the failed device.

Lastly, it is mentioned in some of the links above but if this is your first reading on the subject, please allow me to point out that the configuration setting above is for Permanent Device Loss (PDL) conditions only. It is not meant to handle an APD event. The reason behind this is that the storage array is required to send a proper sense code to the vSphere host indicating a PDL condition.  If the entire array fails or is powered off ungracefully taking down all available paths to storage, it has no chance to send PDL sense codes to vSphere.  This would constitute an indefinite All Paths Down or APD condition where vSphere knows storage is unavailable, but is unsure about its return. PDL was designed to answer that question for vSphere, rather than let vSphere go on wondering about it for a long period of time, thus squandering any opportunities to proactively do something about it.

In reality there are a few other configuration settings (again documented well in the links above) which fine-tune HA more precisely. You’ll almost always want to add these as well.

vSphere 5.0u1+: das.maskCleanShutdownEnabled = True (Cluster advanced options) – this is an accompanying configuration that helps vSphere HA distinguish between VMs that were once powered on and should be restarted, versus VMs that were already powered off when a PDL occurred; the latter don’t need to be, and more importantly probably should not be, restarted.

vSphere 5.5+: Disk.AutoremoveOnPDL = 0 (advanced setting on each host) – This is a configuration I first read about on Duncan’s blog where he recommends that the value be changed from the default of enabled to disabled so that a device is not automatically removed if it enters a PDL state. Aside from LUN number limits a vSphere host can handle (255), VMware refers to a few cases where the stock configuration of automatically removing a PDL device may be desired although VMware doesn’t really specifically call out each circumstance aside from problems arising from hosts attempting to send I/O to a dead device. There may be more to come on this in the future but for now preventing the removal may save in fabric rescan time down the road if you can afford the LUN number expended. It will also serve as a good visual indicator in the vSphere Client that there is a problematic datastore that needs to be dealt with in case the PDL automation restarts VMs with nobody noticing the event has occurred. If there are templates or powered off VMs that were not evacuated by HA, the broken datastore will visually persist anyway.
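
For these two settings, a hedged PowerCLI equivalent might look like the sketch below. Again, the cluster name is a placeholder and an existing vCenter connection is assumed; das.maskCleanShutdownEnabled is a cluster-level HA advanced option while Disk.AutoremoveOnPDL is a per-host advanced setting.

# Cluster-level HA advanced option (vSphere 5.0u1+): only restart VMs that were powered on
New-AdvancedSetting -Entity (Get-Cluster "Cluster01") -Name "das.maskCleanShutdownEnabled" -Value "True" -Type ClusterHA -Confirm:$false

# Per-host advanced setting (vSphere 5.5+): keep PDL devices visible instead of auto-removing them
Get-Cluster "Cluster01" | Get-VMHost | ForEach-Object {
    Get-AdvancedSetting -Entity $_ -Name "Disk.AutoremoveOnPDL" |
        Set-AdvancedSetting -Value 0 -Confirm:$false
}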

That’s the short list of configuration changes to make for HA VM restart.  There are actually a few more here. For instance, fine-grained HA handling can be coordinated on a per-VM basis by modifying the advanced virtual machine option disk.terminateVMOnPDLDefault for each VM. Or scsi#:#.terminateVMOnPDL to fine-tune HA on a per-virtual-disk basis for each VM. I’m definitely not recommending touching these if the situation does not call for it.

In a stock vSphere configuration with VMkernel.Boot.terminateVMOnPDL = no configured (or unintentionally misconfigured I suppose), the following events occur for an impacted virtual machine:

  1. PDL event occurs, sense codes are received and vSphere correctly identifies the PDL condition on the supporting datastore. A question is raised by vSphere for each impacted virtual machine to Retry I/O or Cancel I/O.
  2. Stop. Nothing else happens until each of the questions above are answered with administrator intervention. Answering Retry without the PDL datastore coming back online or without hot removing the impacted virtual disk (in most cases the .vmx will be impacted anyway and hot removing disks is next to pointless) sends the VM to hell pretty much. Answering Cancel allows HA to proceed with powering off the VM and restarting it on another host with access to the device which went PDL on the original host.

In a modified vSphere configuration with VMkernel.Boot.terminateVMOnPDL = yes configured, the following events occur for an impacted virtual machine:

  1. PDL event occurs, sense codes are received and vSphere correctly identifies the PDL condition on the supporting datastore. A question is raised by vSphere for each impacted virtual machine to Retry I/O or Cancel I/O.
  2. Due to VMkernel.Boot.terminateVMOnPDL = yes vSphere HA automatically and effectively answers Cancel for each impacted VM with a pending question. Again, if the hosts aren’t rebooted after the VMkernel.Boot.terminateVMOnPDL = yes configuration change, this step will mimic the previous scenario essentially resulting in failure to automatically carry out the desired tasks.
  3. Each VM is powered off.
  4. Each VM is powered on.

I’ll note in the VM Event examples above, leveraging the power of Snagit I’ve cut out some of the noise about alarms triggering gray and green, resource allocations changing, etc.

For completeness, following is a list of the PDL sense codes vSphere is looking for from the supported storage array:

SCSI sense code – Description
H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0 – LOGICAL UNIT NOT SUPPORTED
H:0x0 D:0x2 P:0x0 Valid sense data: 0x4 0x4c 0x0 – LOGICAL UNIT FAILED SELF-CONFIGURATION
H:0x0 D:0x2 P:0x0 Valid sense data: 0x4 0x3e 0x3 – LOGICAL UNIT FAILED SELF-TEST
H:0x0 D:0x2 P:0x0 Valid sense data: 0x4 0x3e 0x1 – LOGICAL UNIT FAILURE

Two isolated examples of PDL taking place seen in /var/log/vmkernel.log:

Example 1:

2014-07-13T20:47:03.398Z cpu13:33486)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x2a (0x4136803b8b80, 32789) to dev "naa.6000d31000ebf600000000000000006c" on path "vmhba2:C0:T0:L30" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe. Act:EVAL
2014-07-13T20:47:03.398Z cpu13:33486)ScsiDeviceIO: 2324: Cmd(0x4136803b8b80) 0x2a, CmdSN 0xe1 from world 32789 to dev "naa.6000d31000ebf600000000000000006c" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x6 0x3f 0xe.
2014-07-13T20:47:03.398Z cpu13:33486)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x2a (0x413682595b80, 32789) to dev "naa.6000d31000ebf600000000000000007c" on path "vmhba2:C0:T0:L2" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0. Act:FAILOVER

Example 2:

2014-07-14T00:43:49.720Z cpu4:32994)ScsiDeviceIO: 2337: Cmd(0x412e82f11380) 0x85, CmdSN 0x33 from world 34316 to dev "naa.600508b1001c6e17d603184d3555bf8d" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2014-07-14T00:43:49.731Z cpu4:32994)ScsiDeviceIO: 2337: Cmd(0x412e82f11380) 0x4d, CmdSN 0x34 from world 34316 to dev "naa.600508b1001c6e17d603184d3555bf8d" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
2014-07-14T00:43:49.732Z cpu4:32994)ScsiDeviceIO: 2337: Cmd(0x412e82f11380) 0x1a, CmdSN 0x35 from world 34316 to dev "naa.600508b1001c6e17d603184d3555bf8d" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x24 0x0.
2014-07-14T00:48:03.398Z cpu10:33484)NMP: nmp_ThrottleLogForDevice:2321: Cmd 0x2a (0x4136823b2dc0, 32789) to dev "naa.60060160f824270012f6aa422e0ae411" on path "vmhba1:C0:T2:L40" Failed: H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x25 0x0. Act:FAILOVER
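
If you want to sweep a host’s vmkernel log for these sense codes without opening an SSH session, PowerCLI’s Get-Log can retrieve the log so a simple pattern match can flag PDL entries. This is a rough sketch only; the host name is a placeholder and pulling a large log can take a while.

# Retrieve the vmkernel log from one host and flag lines carrying the 0x5 0x25 0x0 (PDL) sense code
$log = Get-Log -Key "vmkernel" -VMHost (Get-VMHost "esx01.example.com")
$log.Entries | Where-Object { $_ -match "0x5 0x25 0x0" }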

In no particular order, I want to thank Duncan, Paudie, Cormac, Mohammed, Josh, Adam, Niran, and MAN1$H for providing some help on this last week.

By the way, don’t name your virtual machines or datastores PDL. It’s bad karma.


Ravikanth Chaganti – Windows Azure Pack: Infrastructure as a service – MVA

If your expertise is in Microsoft Virtualization, System Center or Cloud, there is a Microsoft Virtual Academy event planned for Windows Azure Pack. This event is scheduled to happen on July 16th and 17th. IT Pros, you know that enterprises desire the flexibility and affordability of the cloud, and service providers want the ability…

Jason Boche – VMware vCenter Operations Manager Essentials

A new vSphere book has just arrived and has been added to my library. The book’s title is VMware vCenter Operations Manager Essentials, which was authored by Technical Virtualization Architect and vExpert Lauren Malhoit (@malhoit), with reviews from Michael Poore, Mike Preston, and Chris Wahl.

I ordered this book while attending Dell User Forum a few weeks ago where I did some breakout session speaking on vC Ops and the new Dell Storage adapters for vC Ops.

“This book is written for administrators, engineers, and architects of VMware vSphere as well as those who have or are interested in purchasing the vCenter Operations Manager Suite. It will particularly help administrators who are hoping to use vCenter Operations Manager to optimize their VMware environments as well as quickly troubleshoot both long-term and short-term issues.”

Skimming through the chapter list covering 236 pages, it looks like it’s going to be a pretty good read.

Chapter 1: Introduction to vCenter Operations Manager

Chapter 2: Installing vCenter Operations Manager

Chapter 3: Dashboards and Badges (badges?…. had to be said)

Chapter 4: Troubleshooting Our Virtual Environment with vCenter Operations Manager

Chapter 5: Capacity Planning with vCenter Operations Manager

Chapter 6: Reports

Chapter 7: vCenter Configuration Manager

Chapter 8: Log Insight

Chapter 9: VMware Horizon View Integration with vCenter Operations Manager

Chapter 10: vCenter Infrastructure Navigator

Chapter 11: EMC Storage Analytics

Why did I pick up this book? vC Ops is extremely powerful and it has a bit of a learning curve to it. This is what resonated with me the most when I first began using the product. Over time, vCenter has become an integral component in VMware vSphere virtualized datacenters, and it will continue to be one as more and more applications and services are integrated with and become dependent on it. vC Ops ties together many datacenter infrastructure pieces and allows virtualization, IaaS, cloud computing, and VDI to be delivered more intelligently. I would like to learn more about vC Ops and hopefully pick up some helpful tips on building custom dashboards with stock and add-on adapters/collectors as well as custom widgets.


Mark Cathcart – Property Tax, Travis county, Austin

There are a number of threads running through the posts on this blog about Austin and Texas. One key aspect of them is how things get paid for, and what gets paid for. Since Texas (bigger than Germany, with approximately 7/8 of Germany's population) has no income tax, and boasts about its low corporate taxes, apart from the 6.25% sales tax, property tax is key.

Property tax, the valuation and assessment of properties, has become both increasingly complex and, for many long-term residents, unaffordable. Among those arguing for greater density in Austin, there are calls for better transportation, more affordable rents, etc.

The fact that Cesar Chavez currently has more high-rise development than any other street in America, added to all the stories and blatant self-promotion that Austin is #1 in this, no. 1 in that, highest ranked for everything, has led to a typical Texas business-friendly “gold rush” over the last 10 years, eight of which have been presided over by rail-or-fail Mayor Leffingwell.

All this has led to massive gentrification of the core and central neighborhoods. Development and re-development in itself isn’t evil; it’s the nature of the development and the context it’s done in. However, when that development is done by forcing out people who’ve spent their adult lives in a neighborhood, because they can no longer afford, among other things, the property taxes, that’s just plain wrong and bordering on financial exploitation.

Imagine you were a hard-working manual worker, domestic, construction, yard, office, transportation, etc. in the late 1970s in Austin. A very different place. South of the river was mostly for the working poor, as a legacy of the city’s 1920s policies, and east of I35 for the racially segregated families. You’ve struggled in the heat with no central a/c, poor transport options, typical inner-city problems. You do what you can to plan for your retirement, depend on federally provided health programs, and finally you get to retire in your late 60s.

Then along comes the modern, gold rush Austin. A few people, often like me, move into your neighborhood because we want something authentic, real rather than remote, urban-sprawl neighborhoods. Sooner or later, business spots the opportunity to take advantage of the low property prices, the neighborhood starts to pick up, and before you know it, your meager retirement can’t afford the property taxes that are now annually more than the price of your house from 40 years ago.

Few people seem to understand the emotional and stressful impact of having to even consider moving, let alone being financially relocated in your declining years. It changes virtually every aspect of your life. One possible solution to this, and some of Austin’s other problems, is the “accessory dwelling”. I’ll return to ADUs in a subsequent post; it isn’t as simple as just making them easier to get permitted and built, though.

With the City of Austin, typically for Texas, siding with business and refusing to challenge commercial property tax appraisals, the burden falls on private homes. That’s why it is important for everyone to protest their appraisals until the existing system changes.

If you don’t understand how the system works, and more importantly, why you need to protest, the Austin Monitor has a great discussion on soundcloud.

While I can see my obvious bias, as I said in my July 4th post, I for one would rather opt for a state income tax, even if that meant I would end up paying more tax. That though is very unlikely to ever happen in Texas, and so until then we have to push back and get to a point where businesses and commercial property owners pay their fair share.

Why bias? Well, I’m in my 50s, I won’t be working forever, and my income will then drop off sharply. At least as it currently stands, I plan to stay where I am.


Jason Boche – Drive-through Automation with PowerGUI

One of the interesting aspects of shared infrastructure is stumbling across configuration changes made by others who share responsibility in managing the shared environment. This is often the case in the lab but I’ve also seen it in every production environment I’ve supported to date as well. I’m not pointing any fingers – my back yard is by no means immaculate. Moreover, this bit is regarding automation, not placing blame (Note the former is productive while the latter is not).

Case in point this evening when I was attempting to perform a simple remediation of a vSphere 5.1 four-host cluster via Update Manager. I verified the patches and cluster configuration, hit the remediate button in VUM, and left the office.  VUM, DRS, and vMotion do the heavy lifting. I’ve done it a thousand times or more in the past in environments 100x this size.

I wrap up my 5pm appointment on the way home from the office, have dinner with the family, and VPN into the network to verify all the work was done. Except nothing had been accomplished. Remediation on the cluster was a failure.  Looking at the VUM logs reveals that 75% of the hosts being remediated contain virtual machines with attached devices, preventing VUM, DRS, and vMotion from carrying out the remediation.

Obviously I know how to solve this problem, but to manually check and strip every VM of its offending device is going to take way too long. I know what I’m supposed to do here. I can hear the voices in my head of PowerShell gurus Alan, Luc, etc. saying over and over the well-known automation battle cry: “anything repeated more than once should be scripted!”

That’s all well and good, I completely get it, but I’m in that all too familiar place of:

  1. Carrying out the manual tasks will take 30 minutes.
  2. Authoring, finding, testing a suitable PowerShell/PowerCLI script to automate will also take 30 minutes, probably more.
  3. FML, I didn’t budget time for either of the above.

There is a middle ground. I view it as drive-through efficiency automation. It's called PowerGUI, and it has been around almost forever. In fact, it comes from Quest, which my employer now owns. And I've already got it, along with the PowerPacks and Plug-ins, installed on my new Dell Precision M4800 laptop. Establishing a PowerGUI session and authenticating with my current infrastructure couldn't be easier. From the legacy vSphere Client, choose the Plug-ins pull-down, then PowerGUI Administrative Console.

The VMware vSphere Management PowerPack ships stock with not only a VM query to find all VMs with offending devices attached, but also a method to highlight all of those VMs and disconnect the devices.

Depending on the type of device connected to the virtual machines, VUM may also be able to handle the issue, as it has the native ability to disable any removable media devices connected to the virtual machines on the host. In this case, the problem is solved with automation (I won't get beat up on Twitter) and free community (now Dell) automation tools. Remediation completed.
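
For anyone who does want to go the pure PowerShell/PowerCLI route instead, here is a minimal sketch of the same clean-up. It assumes PowerCLI is loaded and connected to vCenter; the vCenter and cluster names below are made up, so substitute your own and try it somewhere non-critical first.

    # Minimal PowerCLI sketch (hypothetical vCenter/cluster names; test before using in anger)
    Connect-VIServer -Server "vcenter.lab.local"

    # Detach mounted CD/DVD media from every VM in the cluster so DRS/vMotion can evacuate hosts
    Get-Cluster "Cluster01" | Get-VM |
        Get-CDDrive |
        Where-Object { $_.ConnectionState.Connected } |
        Set-CDDrive -NoMedia -Confirm:$false

    # Connected floppy drives can block vMotion too; clear them the same way
    Get-Cluster "Cluster01" | Get-VM |
        Get-FloppyDrive |
        Where-Object { $_.ConnectionState.Connected } |
        Set-FloppyDrive -NoMedia -Confirm:$false

With the devices detached, re-running the VUM remediation should let DRS evacuate the hosts normally.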

RVTools (current version 3.6) also has identical functionality to quickly locate and disconnect various devices across a virtual datacenter.


Post from: boche.net - VMware Virtualization Evangelist

Copyright (c) 2010 Jason Boche. The contents of this post may not be reproduced or republished on another web page or web site without prior written permission.


Ravikanth ChagantiAzure Automation and PowerShell at PS Bangalore User Group

I will be speaking about Azure Automation at the upcoming PS Bangalore User Group (PSBUG) meeting on July 19th, 2014. Also, the theme of this user group meeting is to talk about PowerShell in the context of Azure and System Center. We have our newly awarded PowerShell MVP, Deepak, joining us for a session on System Center…

Barton GeorgeFindings from 451's DevOps study — DevOps Days Austin

Today we come to the final interview from DevOps Days Austin.  I began the series with an interview with Andrew Clay Shafer who gave the first-day keynote.  Today I close, with perfect symmetry, with Michael Cote of 451 Research, who gave the keynote on the second day.

In his keynote, posted below, Cote presented findings from a study 451 did on DevOps usage.  I caught up with Cote to learn more.  Take a listen.

Some of the ground Cote covers:

  • Tracking tool usage as a proxy for DevOps
  • How they focused their study on companies outside of technology
  • What they found and given that, what advice would they give to
    1. IT
    2. Vendors in this space
    3. Investors
  • How Cote would advise a mainstream CIO looking to get into DevOps and set a strategy

 

Extra-credit reading

Pau for now…


Mark CathcartInternet security < Whose risk?

In my professional life I'm acutely aware of the demands of computer and software security; see this post from yesterday on my tech blog cathcam.wordpress.com as an example of things I'm currently involved in. This post, though, is prep for my call tomorrow with my UK bank, FirstDirect, a division of global banking conglomerate HSBC. It made me wonder: who are they protecting, me or them?

The answer is obviously them…

I don’t use my UK bank account much, I don’t have any investments, it’s a small rainy day fund that I use to sponsor friends and family in worthy endeavors, to pay UK Credit card an other bills to avoid international banking/finance rip-off charges, like when I send flowers to my Mum on Mothers day.

Today I finally had time to set up a payment for my lifetime membership of the BCS, the British Computer Society (*). As usual I went to the FirstDirect banking URL, put in my online banking ID, correctly answered the password question, which asks for three randomly chosen letters from your password, and finally correctly answered my secret question.

secure key options

Do not pass go, do not collect $100

Instead of getting logged in, I was presented with the following, which forced me to choose one of three options.

Over a 100 apps, none of them First Direct

  1. Get their Secure Key App
  2. Have FirstDirect send me a random key generator via snail mail
  3. Log in to online banking with basically "read only" capabilities

I’m looking forward to having this explained

The only real option was 1. I went to install the app, but first I had a hard time finding it. FirstDirect doesn't provide a direct link from their website; they suggest searching for "banking on the go" in the iTunes and Play stores, so I did. It returned over 100 results, none of them obviously FirstDirect. So I asked Google…

No go, it’s aparently the FirstDirect app is incompatible with any of the four actual devices I own, let alone the don’t have a browser/PC version, which frankly is a nonsense.

I’m guessing and open to be proven wrong that the app isn’t incompatible but it actually requires a UK provider IMEI number or similar to register with. Given that doesn’t work and options 2. and 3. were not viable, I picked up the phone and called. They won’t accept Skype calls, so that was an international call at my cost.

The conversation went something like this… security questions… except I couldn't remember my memorable date. All I could remember about my memorable date was that I'd forgotten it once before, so why write it down? Did I have my debit card with me? No, why would I? I'm at work in the USA, where I live; I don't need it here.

So, after a short but polite rant, I got put through to a supervisor, who called me back. We went through all my security questions again, I took a guess at my date and surprisingly got it right. She asked how she could help, and I told her. I said I can't be the only non-UK customer; she agreed, and someone from overseas banking is going to call me.

(*) Interestingly, this all came about because the BCS doesn't have an online system capable of accepting payments for lifetime memberships. This caused me to scratch my head and wonder: given I was the lead architect for the UK National Westminster Bank's Internet banking system in 1998/9, and worked on the protocol behind Chemical Bank's Pronto home banking system in 1983, as much as everyone marvels at technology today, we are really going backwards, not forwards.

What a nonsense.


Rob HirschfeldSDN’s got Blind Spots! What are these Projects Ignoring? [Guest Post by Scott Jensen]

Scott Jensen returns as a guest poster about SDN!  I'm delighted to share his pointed insights that expand on my previous two-part series about NFV and SDN.  I especially like his Rumsfeldian "unknowable workloads".

In my [Scott's] last post, I talked about why SDN is important in cloud environments; however, I’d like to challenge the underlying assumption that SDN cures all ops problems.

The SDN implementations which I have looked at make the following base assumption about the physical network.  From the OpenContrail documentation:

The role of the physical underlay network is to provide an “IP fabric” – its responsibility is to provide unicast IP connectivity from any physical device (server, storage device, router, or switch) to any other physical device. An ideal underlay network provides uniform low-latency, non-blocking, high-bandwidth connectivity from any point in the network to any other point in the network.

The basic idea is to build an overlay network on top of the physical network in order to utilize a variety of protocols (NetFlow, VLAN, VXLAN, MPLS, etc.) and build the networking infrastructure which is needed by the applications and, more importantly, allow the applications to modify this virtual infrastructure to build the constructs that they need to operate correctly.

All well and good; however, what about the Physical Networks?

Under Provisioned / FunnyEarth.com

That is where you will run into bandwidth issues, QoS issues, latency differences, and where the rubber really meets the road.  Ignoring the physical network's configuration can (and probably will) cause the entire system to perform poorly.

Does it make sense to just assume that you have uniform low latency connectivity to all points in the network?  In many cases, it does not.  For example:

  • Accesses to storage arrays have a different traffic pattern than a distributed storage system.
  • Compute resources which are used to house VMs which are running web applications are different than those which run database applications.
  • Some applications are specifically sensitive to certain networking issues such as available bandwidth, Jitter, Latency and so forth.
  • Other applications will perform actions over the network at certain times of the day but will not require the network resources for the rest of the day.  Classic examples of this are system backups or replication events.

Over Provisioned / zilya.net

If it is truly unknown how the infrastructure you are trying to implement will be utilized, then you may have no choice but to over-provision the physical network.  In building a public cloud, the users will run whichever applications they wish, so it may not be possible to engineer the appropriate traffic patterns.

This unknowable workload is exactly what these types of SDN projects are trying to target!

When designing these systems, you often do have a good idea of how the system will be utilized, or at least how specific portions of it will be utilized, and you need to account for that when building up the physical network under the SDN.

It is my belief that SDN applications should not just create an overlay.  That is part of the story, but they should also take into account the physical infrastructure and assist with modifying the configuration of the physical devices.  This balance achieves the best use of the network both for the applications which are running in the environment AND for the systems which they run on or rely upon for their operations.

Correctly Provisioned

We need to reframe our thinking about SDN because we cannot just keep assuming that network speeds will follow Moore's Law and that the network is an unlimited resource.


Barton GeorgeDell Cloud Manager and Customer Transformation– DevOps Days Austin

Here is my penultimate post from DevOps Days Austin.  Today's interview features Vann Orton, a Dell Sales Engineer for Dell Cloud Manager.  I chatted with Vann about the customers he's been visiting out in the field and what he's seeing.

Some of the ground Vann covers:

  • What Dell Cloud Manager does and what pains it addresses for customers
  • How Vann used Chef to connect Dell Cloud Manager and Foglight
  • What customers are facing as they look to implement cloud, and how he shares Dell's learnings from implementing our own cloud
  • How the conversation evolves into the higher-order concern of business transformation and shifting to a services model

Still to come, last but not least: Cote's DevOps Days keynote.

Extra-credit reading

Pau for now…


Mark CathcartOpenSSL and the Linux Foundation

Former colleague and noted open source advocate Simon Phipps recently reblogged to his webmink blog a piece that was originally written for meshedinsights.com.

I committed Dell to support the Linux Foundation Core Infrastructure Initiative (CII) and attended a recent day-long board meeting with other members to discuss next steps. I'm sure you understand, Simon, but for the benefit of readers, here are just two important clarifications.

By joining the Linux Foundation CII initiative, your company can contribute to helping fund developers of OpenSSL and similar technologies directly through Linux Foundation Fellowships. This is in effect the same as you (Simon) are suggesting: having companies hire experts. The big difference is that the Linux Foundation helps the developers stay independent and removes the current need to fund their work through the (for-profit) OpenSSL Software Foundation (OSF). They also remain independent of any large company's controlling interest.

Any expansion of the OpenSSL team depends on the team itself being willing and able to grow. We need to be mindful of Brooks' mythical man-month. Having experts outside the team producing fixes and updates faster than they can be consumed (reviewed, tested, verified, packaged, and shipped) just creates a fork if not adopted by the core.

I’m hopeful that this approach will pay off. The team need to produce at least an abstract roadmap for bug fix adoption, code cleanup and features, and I look forwarding to seeing this. The Linux Foundation CII initiative is not limited to OpenSSL, but that is clearly the first item on the list.

Mark CathcartHappy 4th to all my US Readers…

My yearly token protest

Happy 4th of July. This was my token yearly protest at 6:30am this morning, throwing a tea bag into Lake Austin from the Congress (bat) bridge. No taxation without representation!

The reality is I pay little tax in Texas, not counting what the State takes from the federal government. However, I for one would be willing to pay a state income tax if it helped fix the deep inequalities in the property tax system, which have arisen from people like me moving to Austin and driving huge increases in property taxes. The state of Travis County property taxes is in itself deeply unjust for those who have lived through the last 30 years in the same houses and now find assessments leaping up yearly by the maximum allowed. My property tax appeal will be heard in August.

Notwithstanding my complaints and attempts here to understand the massive bias toward big business in Texas, and the unjust social impact that regulation has on minorities and, more recently, women, I really like it here. Happy 4th!


Mark CathcartAbbott's Texas Miracle

This week Attorney General and Republican gubernatorial candidate Greg Abbott continued to demonstrate that the Texas Miracle is based only on smoke and mirrors.

Zap, Pow, Sock it to 'em Abbott

First up, Abbott claimed victory over the evil empire, the Federal Government's Environmental Protection Agency. Abbott has time and time again sued the EPA to try to get relief for Texas-based businesses, claiming almost everything except that the dog ate their homework. The only thing Abbott hasn't denied is that Texas is the worst state when it comes to air pollution, and given its size and position, that pollution is a major contributor to US pollution and to pollution in other states. But, hey, apparently that's too bad, as the regulations would be too costly for Texas business to implement.

The truth is that Abbott won a battle to save small businesses from implementing these regulations, but lost the war; the coal plants and other major facilities will have to implement them. The EDF has a different perspective but comes to the same conclusion.

Meanwhile, Abbott ("What I really do for fun is I go into the office, [and] I sue the Obama administration.") has been explaining the unexplainable, back-pedaling on his order to restrict access to the hazardous chemicals list. As posted last week in "The Texas Freedom Illusion", Abbott confirmed the ban on releasing to the public the information in Tier II reports filed under the 1986 Emergency Planning and Community Right-to-Know Act (EPCRA).

Well, it turns out he's explained his position. You, yes, you the people, have not lost your right to know under the EPCRA. If you want to know, apparently all you have to do is visit the plants, or write to them, and ask. Instead of letting concerned citizens check the state database where businesses are required to register, the State is pushing the handling costs onto the businesses. The Daily Kos has a great piece on this, describing Abbott's remarks as "jaw dropping". < Zap pow!

It can’t be because it’s more secure that way.because it sure isn’t anymore secure. I’m sure the terrorists would never think of that, after all, they didn’t think of taking private flying lessons pre-9/11… when they couldn’t get trained by the Government.

Meanwhile, Abbott has also been re-confirming that the Texas Miracle doesn't come with workers' compensation insurance; Texas is the only state in America where it isn't required. The Texas Tribune this week published a damning report into the cost and effect of this on workers. For as little as $1.38, businesses could provide workers' comp, but like that EPA cost, that's too much of a burden. The downside of this is workers getting hurt, seriously hurt, often with no medical coverage, and that means you and I are picking up the tab.

So, let's summarize. Abbott is running for Governor. He is:

  • Not prepared to require businesses to meet the same emissions standards they must meet elsewhere in the USA
  • Not prepared to require workers' comp. insurance
  • Not prepared to provide citizens access to the data the State has on dangerous chemical storage
  • Continuing to sue the President's Administration, costing hundreds of thousands of taxpayer dollars, for no real purpose and little result

I’ll be protesting in the morning, Taxation with no representation, I can’t even vote for someone else, let alone against him. Zap Pow – Robin’ the people to pay for business.


Daniel MorrisCompellent and EqualLogic best practices for the first half of 2014

A load of new technical documents, white papers and best practices have been released for Dell Storage in the last 6 months. Here they are in no specific order, mainly because I am pasting this from an internal email :) Scroll down for the EqualLogic sections.

Compellent Publications

Compellent SC4000 Storage Controller

Dell Compellent Storage Center Management Pack 3.0 for Microsoft SCOM 2012/R2 Best Practices

This document provides best practices and step-by-step procedures for installing Management Pack 3.0, configuring connectivity between SCOM and Storage Center, and troubleshooting tips

Microsoft SQL Server Best Practices with Dell Compellent Storage Center (refresh)

This document describes the best practices for running Microsoft SQL Server with a Dell Compellent Storage Center.

Introduction to OpenStack with Dell Compellent Storage Center

This paper introduces the integration and connectivity of OpenStack with Dell Compellent storage.

Dell Compellent FS8600 Snapshots and Cloning Best Practices

This paper discusses in depth the snapshot and volume clone data protection capability that is built into FluidFS. The paper reviews the overall FluidFS data protection capabilities and focuses on FluidFS redirect-on-write snapshots, including snapshot sizing, management and monitoring, performance, and typical use cases.

Dell Compellent FS8600 Thin Provisioning Best Practices

This paper discusses the thin provisioning capability built into Dell Fluid File System (FluidFS) version 3, including management, monitoring and typical use cases.

Best Practices for deploying SharePoint 2013 utilizing Dell Compellent Storage Center

This paper provides best practice guidance for deploying Microsoft SharePoint 2013 utilizing the Dell Compellent Storage Center.

Dell Compellent 6.5 SED Reference Architecture & Best Practices

This document focuses on Secure Data and Self-Encrypting Drives (SED) and contains a reference architecture and best practices.

Dell Compellent SC4020 v6.5 10000-user Exchange 2013 Mailbox Resiliency Solution with 900GB drives

This document provides information on Dell Compellent’s new SC4020 storage solution for Microsoft Exchange Server, based on the Microsoft Exchange Solution Reviewed Program (ESRP) – Storage program.

Dell Compellent SC4020 v6.5 4500-user Exchange 2013 Mailbox Resiliency Solution with 1TB drives

This document provides information on Dell Compellent’s new SC4020 storage solution for Microsoft Exchange Server, based on the Microsoft Exchange Solution Reviewed Program (ESRP) – Storage program.

Dell Compellent Storage Center 6.5 Multi-VLAN Tagging Solution Brief

This solution brief introduces the new Multi-VLAN tagging feature in Storage Center 6.5.

Windows Server 2012 R2 Best Practices for Dell Compellent Storage Center (refresh)

This document provides an overview of Microsoft Windows Server 2012/R2 and introduces best practice guidelines when integrating Windows Server 2012/R2 with the Dell Compellent Storage Center.

Dell Compellent Storage Center Synchronous Replication and Live Volume Solutions Guide

This guide focuses on two main data protection and mobility features available in Dell Compellent Storage Center: Synchronous Replication and Live Volume.

Dell Compellent Live Volume and Synchronous Replication Demo Video

This video focuses on Live Volume on Storage Center 6.5.

Dell Compellent Storage Center 6.5.1 and Data Compression Feature Brief

This feature brief provides a high-level overview of how data compression works with the Dell Compellent Storage Center 6.5.1.

VMware ESX(i) Best Practices for Dell Compellent Storage Center

This document provides configuration examples, tips, recommended settings, and other storage guidelines a user can follow while integrating VMware vSphere 5.x with the Dell Compellent Storage Center.

FS8600 with VMware vSphere Deployment and Configuration Best Practices

This paper outlines key points to consider throughout the design, deployment and ongoing maintenance of a vSphere solution with Dell’s FS8600 file-level storage. The topics included in this document provide the reader with fundamental knowledge and tools needed to make vital decisions to optimize the solution.

Running Oracle over NFS with the Dell Compellent FS8600 Scale-out File System

This guide focuses on the features and general best practices of the FS8600 when integrating it into an Oracle environment.

Citrix XenDesktop VDI with Dell Compellent SC8000 All-Flash Arrays for 3,000 Persistent Desktop Users

This paper highlights a 3,000 user persistent desktop VDI architecture using Citrix XenDesktop 7.5 with Citrix Machine Creation Services (MCS).

Citrix XenDesktop VDI with Dell Compellent SC4020 All-Flash Arrays for 1,800 Persistent Desktop Users

This paper highlights an 1,800 user persistent desktop VDI architecture using Citrix XenDesktop 7.5 with Citrix Machine Creation Services (MCS).

Best Practices for Configuring an FCoE Infrastructure with Dell Compellent and Cisco Nexus

This white paper describes use cases for implementing a point-to-point FCoE infrastructure, provides guidance for sizing a converged infrastructure, and provides best practices for designing, configuring and deploying an FCoE-only infrastructure

Best Practices for Securing Dell Compellent Storage Center

This technical paper serves as a guide to deploying a secure Compellent Storage Center SAN to prevent unauthorized access to administrative interfaces and to protect data at rest using self-encrypting drives.

FluidFS Antivirus Integration

This paper provides technical information about integrating Dell FluidFS based systems with one of the external Antivirus Dell technology partners and some best practices for deploying these solutions.

Migrating an Oracle VM Environment Across Heterogeneous Storage Arrays

This paper presents key steps and procedures for migrating an Oracle VM environment from Dell EqualLogic storage arrays to a Dell Compellent SAN. The migration solution discussed in this whitepaper is validated and approved jointly by Dell and Oracle engineering teams.

EqualLogic Publications

Best Practices for Securing a Dell EqualLogic SAN

This technical paper serves as a guide to deploying a secure EqualLogic SAN to prevent unauthorized access to the PS Series administrative interfaces and to protect data at rest using self-encrypting drives and data in flight using IPsec network layer security.

Performance Baseline for Deploying Microsoft SQL Server 2012 OLTP Database Applications Using EqualLogic PS Series Hybrid Storage Arrays

This white paper includes the results of storage I/O performance tests executed using EqualLogic PS series hybrid arrays and illustrates how the unique automated tiering capability of PS6110XS arrays can help in addressing the unique challenges of a Microsoft SQL Server based OLTP database application.

Configuring a Dell EqualLogic SAN Infrastructure with Virtual Link Trunking (VLT)

There are many possibilities for configuring various topologies of storage networks. This white paper discusses a few typical topologies that are found when deploying Dell EqualLogic iSCSI SANs with Dell Networking (Force10) switches. This document also provides best practice guidance and design considerations for deploying EqualLogic iSCSI SANs within an Ethernet switching infrastructure that utilizes Dell Networking VLT.

Sizing and Best Practices for Deploying Oracle VM using Dell EqualLogic Hybrid Arrays

This paper presents the key guidelines and best practices for deploying Oracle VM with EqualLogic hybrid arrays.  It also highlights some of the key configuration settings to achieve optimal performance and compares performance of virtualized and non-virtualized Oracle database solutions.

Sizing and Best Practices for Deploying Microsoft Exchange Server 2013 on VMware vSphere and Dell EqualLogic PS6110E Arrays

The solution presented in this paper addresses some of the most relevant variables to take into consideration when planning the deployment of a virtualized Exchange Server 2013 back-end solution in conjunction with an EqualLogic PS6110E array with 4TB drives.

Using Dell EqualLogic Storage with Failover Clusters and Hyper-V

This deployment and configuration guide details using Dell EqualLogic storage with Microsoft Windows Server 2012 R2 Failover Clusters and Hyper-V, including configuration options and recommendations for servers, storage and networking.

Best Practices and Sizing Guidelines for Transaction Processing Applications with Microsoft SQL Server 2012 using EqualLogic PS Series Storage Arrays

The objective of this paper is to present the results of SQL Server I/O performance tests conducted using EqualLogic PS6210XS hybrid arrays. It also provides sizing guidelines and best practices for running SQL OLTP workloads on these arrays.

SharePoint Data Protection with Auto-Snapshot Manager/Microsoft Edition 4.7 Dell EqualLogic PS Series

The information in this guide is intended for administrators that have deployed SharePoint Server and are interested in using EqualLogic snapshots for efficient protection and recovery of SharePoint Server components.

Using Microsoft SQL Server with Dell EqualLogic PS Series Arrays

Best practices and configuration guidelines for deploying SQL Server with EqualLogic storage arrays.

EqualLogic Integration: Host Integration Tools for Linux with Auto-Snapshot Manager

Describes using Dell EqualLogic Host Integration Tools for Linux 1.3 Auto-Snapshot Manager to perform online Smart Copy protection and recovery with PS Series arrays.

EqualLogic Integration: Installation and Configuration of Host Integration Tools for Linux – HIT/Linux

Describes how to install and configure the Dell EqualLogic Host Integration Tools for Linux 1.3 with Dell EqualLogic PS Series storage arrays.

EqualLogic Virtual Storage Manager v4.0: Installation Considerations and Datastore Management

This guide covers installation considerations and datastore management with the Dell EqualLogic Virtual Storage Manager v4.0.

EqualLogic Virtual Machine Protection with Dell EqualLogic Virtual Storage Manager v4.0

This Technical Report focuses on the usage of the Dell EqualLogic Virtual Storage Manager v4.0 to coordinate VMware™ aware snapshots and PS Series SAN snapshots to provide an additional layer of data protection and recovery.

Deploying Solaris 11 with EqualLogic Arrays

Step-by-step guide to integrating an Oracle Solaris 11 server with a Dell EqualLogic PS Series Array

Data Center Virtualization using Windows Server Hyper-V, PowerShell, and Dell EqualLogic Storage

This collection of papers constitutes a technical solutions guide that demonstrates virtualizing a Microsoft Server environment using Hyper-V and Dell EqualLogic storage products in a Small to Medium Business (SMB). The guide focuses on key technologies in Windows Server Hyper-V, initial deployment of a Windows Server Core Hyper-V host, deploying a dedicated Hyper-V management guest, and protecting the Hyper-V environment using Dell EqualLogic storage and tools.

FS76x0 with VMware vSphere Deployment and Configuration Best Practices

This paper outlines key points to consider throughout the design, deployment and ongoing maintenance of a vSphere solution with Dell’s FS76x0 file-level storage. The topics included in this document provide the reader with fundamental knowledge and tools needed to make vital decisions to optimize the solution.

EqualLogic Configuration Guide 15.1

The goal of the Dell EqualLogic Configuration Guide is to help storage administrators determine how best to build an iSCSI infrastructure for use within an EqualLogic SAN solution. This document focuses on network configuration, host integration and other topics that help to ensure a smooth deployment with the goal of obtaining optimum SAN performance.

Switch Configuration Guides for EqualLogic SANs

Switch Configuration Guides for EqualLogic SANs provide step-by-step instructions for configuring Ethernet switches for use with EqualLogic PS Series storage using Dell Best Practices. New Dell and 3rd party switches were added to the document set. Sustaining updates were made to Dell PowerConnect and Force10 switches.

Rapid EqualLogic Configuration Portal

The Dell Rapid EqualLogic Configuration Series of documents is intended to assist users in deploying EqualLogic iSCSI SAN solutions. The documents employ tested and proven Dell best practices for EqualLogic SAN environments. They provide step-by-step instructions for configuring EqualLogic arrays with Ethernet switches and host OSes. The documentation set was updated this quarter to include a Red Hat Linux host guide and maintenance releases for the switch and host guides.


Barton GeorgeRackspace’s DevOps Practice — DevOps Days Austin

Continuing with my interview series from DevOps Days Austin, today’s interview is with Matt Barlow.  Matt established Rackspace’s support offering around DevOps automation late last year.  Hear about it and how it all came to be.

Some of the ground Matt covers:

  • Matt’s background and how he got into DevOps
  • What led him to developing a practice
  • What exactly his team does
  • What types of customers have they been working with

Still to come from DevOps Days Austin: Dell Cloud Manager, Cote’s keynote

Extra-credit reading

Pau for now…


Rob HirschfeldOps Bridges > Building a Sharable Ops Infrastructure with Composable Tool Chain Orchestration

This post started from a discussion with Judd Maltin that he documented in a post about "wanting a composable run deck."

Fitz and the Tantrums: Breaking the Chains of Love

I've had several conversations comparing OpenCrowbar with other "bare metal provisioning" tools that do things like serve golden images to PXE or iPXE servers to help bootstrap deployments.  While those are handy tools, they do nothing to really help operators drive system-wide operations; consequently, they have a limited system impact/utility.

In building the new architecture of OpenCrowbar (aka Crowbar v2), we heard very clearly that the system should have "less magic".  We took that advice very seriously and made sure that Crowbar is a system layer that works with, not a replacement for, standard operations tools.

Specifically, node boot & kickstart alone is just not that exciting.  It's a combination of DHCP, PXE, HTTP and TFTP, or DHCP and an iPXE HTTP server.   It's a pain to set this up, but I don't really get excited about it anymore.   In fact, you can pretty much use open ops scripts (Chef) to set up these services because it's cut-and-dried operational work.

Note: Setting up the networking to make it all work is perhaps a different question and one that few platforms bother talking about.

So, if doing node provisioning is not a big deal then why is OpenCrowbar important?  Because sustaining operations is about ongoing system orchestration (we’d say an “operations model“) that starts with provisioning.

It’s not the individual services that’s critical; it’s doing them in a system wide sequence that’s vital.

Crowbar does NOT REPLACE the services.  In fact, we go out of our way to keep your proven operations tool chain.  We don't want operators to troubleshoot our iPXE code!  We'd much rather use the standard stuff and orchestrate the configuration in a predictable way.

In that way, OpenCrowbar embraces and composes the existing operations tool chain into an integrated system of tools.  We always avoid replacing tools.  That’s why we use Chef for our DSL instead of adding something new.

What does that leave for Crowbar?  Crowbar provides a physical-infrastructure-targeted orchestration (we call it "the Annealer") that coordinates this tool chain to work as a system.  It's the system perspective that's critical because it allows all of the operational services to work together.

For example, when a node is added we have to create v4 and v6 IP address entries for it.  This is required because secure infrastructure requires reverse DNS.  If you change the name of that node or add an alias, Crowbar again needs to update the DNS.  This has to happen in the right sequence.  If you create a new virtual interface for that node then, again, you need to update DNS.   This type of operational housekeeping is essential and must be performed in the correct sequence at the right time.
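
Purely as an illustration of that forward-and-reverse record housekeeping (and not how Crowbar itself implements it), here is a small sketch using the Windows Server DnsServer PowerShell cmdlets; the zone, node name, alias and addresses are all made up.

    # Illustrative only: hypothetical zone, host name and addresses
    $zone = "lab.example.com"
    $node = "node42"

    # Forward v4 and v6 records, each with a matching PTR for reverse DNS
    Add-DnsServerResourceRecordA    -ZoneName $zone -Name $node -IPv4Address "10.20.30.42" -CreatePtr
    Add-DnsServerResourceRecordAAAA -ZoneName $zone -Name $node -IPv6Address "fd00:20:30::42" -CreatePtr

    # Renaming the node or adding an alias means touching DNS again, in the right order
    Add-DnsServerResourceRecordCName -ZoneName $zone -Name "node42-mgmt" -HostNameAlias "$node.$zone"

The point is not the specific cmdlets; it's that every add, rename or new interface fans out into several dependent record updates that have to land in the right order.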

The critical insight is that Crowbar works transparently alongside your existing operational services with proven configuration management tools.  Crowbar connects links in your tool chain but keeps you in the driver’s seat.


Barton GeorgeSumo Logic and Machine Data Intelligence — DevOps Days Austin

Today’s interview from DevOps Days Austin features Sumo Logic’s co-founder and CTO, Christian Beedgen.  If you’re not familiar with Sumo Logic it’s a log management and analytics service.   I caught up with Christian right after he got off stage on day one.

Some of the ground Christian covers:

  • What does Sumo Logic do?
  • How is it different from Splunk and Loggly?
  • What partners and technology make up the Sumo Logic ecosystem?
  • What areas will Sumo Logic focus on in the coming year?

Still to come from DevOps Days Austin:  Rackspace, Dell Cloud Manager, Cote’s Keynote

Extra-credit reading

Pau for now….


Rob HirschfeldOpenCrowbar stands up 100 node community challenge

OpenCrowbar community contributors are offering a "100 Node Challenge" by volunteering to set up a 100+ node Crowbar system to prove out the v2 architecture at scale.  We picked 100* nodes since we wanted to firmly break the Crowbar v1 upper ceiling.

Going up!

The goal of the challenge is to prove scale of the core provisioning cycle.  It's intended to be a short action (less than a week), so we'll need advance information about the hardware configuration.  The expectation is to do a full RAID/disk hardware configuration beyond the base IPMI config before laying down the operating system.

The challenge logistics start with an off-site prep discussion of the particulars of the deployment, then installing OpenCrowbar at the site and deploying the node century.  We will also work with you on using OpenCrowbar to manage the environment going forward.

Sound too good to be true?  Well, as community members are doing this on their own time, we are only planning one challenge candidate and want to find the right target.
We will not be planning custom code changes to support the deployment; however, we would be happy to work with you in the community to support your needs.  If you want help to sustain the environment or have longer-term plans, I have also been approached by community members who are willing to take on full- or part-time Crowbar consulting engagements.
Let’s get rack’n!
* we’ll consider smaller clusters but you have to buy the drinks and pizza.

Ravikanth ChagantiWindows PowerShell MVP for another year!

I am happy to share this news once again; in fact, for the fifth time! The past year has been a very interesting one. With the release of PowerShell 4.0 and DSC, I had a chance to reach out to a large number of community members and help them understand the new technology. I am sure…

Rob HirschfeldYou need a Squid Proxy fabric! Getting Ready State Best Practices

Sometimes solving a small problem well makes a huge impact for operators.  Talking to operators, it appears that automated configuration of Squid does exactly that.

Not a SQUID but...

If you were installing OpenStack or Hadoop, you would not find “setup a squid proxy fabric to optimize your package downloads” in the install guide.   That’s simply out of scope for those guides; however, it’s essential operational guidance.  That’s what I mean by open operations and creating a platform for sharing best practice.

Deploying a base operating system (e.g., CentOS) on a lot of nodes creates bit-tons of identical internet traffic.  By default, each node will attempt to reach internet mirrors for packages.  If you multiply that by even 10 nodes, that's a lot of traffic and a significant performance impact if your connection is limited.

For OpenCrowbar developers, the external package resolution means that each dev/test cycle with a node boot (which happens up to 10+ times a day) is bottlenecked.  For QA and install, the problem is even worse!

Our solution was 1) to embed Squid proxies into the configured environments and 2) to automatically configure nodes to use the proxies.   By making this behavior the default, we improve the overall performance of a deployment.   This further improves the overall network topology of the operating environment while adding improved control of traffic.

This is a great example of how Crowbar uses existing operational tool chains (Chef configures Squid) in best practice ways to solve operations problems.  The magic is not in the tool or the configuration, it’s that we’ve included it in our out-of-the-box default orchestrations.

It’s time to stop fumbling around in the operational dark.  We need to compose our tool chains in an automated way!  This is how we advance operational best practice for ready state infrastructure.


Rob HirschfeldWho’s in charge here anyway? We need to start uncovering OpenStack’s Hidden Influencers

After the summit (#afterstack), a few of us compared notes and found a common theme in an underserved but critical part of the OpenStack community.  Sean Roberts (his post), Allison Randal (her post), and I committed to expand our discussion to the broader community.

Portholes

Lack of Product Management¹ was a common theme at the Atlanta OpenStack summit.  That effectively adds fuel to the smoldering "lacking a benevolent dictator" commentary that lingers like smog at summits.  While I've come to think this criticism has merit, I think it's a dramatic oversimplification of the leadership dynamic.  We have plenty of leaders in OpenStack, but we don't do enough to coordinate them because they are hidden.

One reason to reject “missing product management” as a description is that there are LOTS of PMs in OpenStack.  It’s simply that they all work for competing companies.  While we spend a lot of time coordinating developers from competing companies, we have minimal integration between their direct engineering managers or product managers.

We spend a lot of time getting engineers talking together, but we do not formally engage discussion between their product or line managers.  In fact, companies likely encourage those managers to send their engineers to summits instead of attending themselves; consequently, we may not even know who those influencers are!

When the managers are omitted, the commitments made by engineers to projects are empty promises.

At best, this results in a discrepancy between expected and actual velocity.  At worst, work may be waiting on deliveries that have been silently deprioritized by managers who do not directly participate or who simply felt excluded from the technical discussion.

We need to recognize that OpenStack work is largely corporate sponsored.  These managers directly control the engineers’ priorities so they have a huge influence on what features really get delivered.

To make matters worse (yes, they get worse), these influencers are often invisible.  Our tracking systems focus on code committers and completely miss the managers who direct those contributors.  Even if they had the needed leverage to set priorities, OpenStack technical and governance leaders may not know who to contact to resolve conflicts.

We’ve each been working with these “hidden influencers” at our own companies and they aren’t a shadowy spy-v-spy lot, they’re just human beings.  They are every bit as enthusiastic about OpenStack as the developers, users and operators!  They are frequently the loudest voices saying “Could you please get us just one or two more headcount for the team, we want X and Y to be able to spend full-time on upstream contribution, but we’re stretched too thin to spare them at the moment”.

So it’s not intent but an omission in the OpenStack project to engage managers as a class of contributors. We have clear avenues for developers to participate, but pretty much entirely ignore the managers. We say that with a note of caution, because we don’t want to bring the managers in to “manage OpenStack”.

We should provide avenues for collaboration so that as they’re managing their team of devs at their company, they are also communicating with the managers of similar teams at other companies.

This is potentially beneficial for developers, managers and their companies: they can gain access to resources across company lines. Instead of being solely responsible for some initiative to work on a feature for OpenStack, they can share initiatives across teams at multiple companies. This does happen now, but the coordination for it is quite limited.

We don’t think OpenStack needs more management; instead, I think we need to connect the hidden influencers.   Transparency and dialog will resolve these concerns more directly than adding additional process or controls.


¹ as opposed to Release Management.  Product Management determines what’s going into future releases while Release Management herds the cats for current and immediately pending releases.


Barton GeorgePagerDuty and incident management — DevOps Days Austin

I'm picking back up the series I started last month, DevOps Days Austin.  Today's interview features Arup Chakrabarti of PagerDuty, who presented at DevOps Days and leads PagerDuty's ops engineering team.  Take a listen:

Some of the ground Arup covers:

  • What PagerDuty does (hint: it has to do with incident management and alerting for IT monitoring)
  • How they integrate with partners like Sumo Logic, Datadog and New Relic
  • How the 3 founders took the experience they gained at Amazon around incident management to found PagerDuty
  • What to look for in the upcoming months

Stay tuned, there are still three more interviews from DevOpsDays Austin to come: SumoLogic, Dell Cloud Manager, Cote and his keynote.

Extra-credit reading

Pau for now…


Mark CathcartThe Texas Freedom Illusion

Governor Perry is well known for bragging that "the Lone Star State's winning mix of low taxes, reasonable regulatory structure, fair court system and world-class workforce has been paying dividends" and bringing business to Texas, even when it isn't true.

Courtesy the Dallas News

For the day-to-day Texan, freedom is increasingly becoming an illusion too.

To have real freedom, you need choice. Increasingly, Texans have no freedom, because they have no choice. This week, Attorney General Abbott confirmed the ban on releasing to the public the information in Tier II reports filed under the 1986 Emergency Planning and Community Right-to-Know Act (EPCRA).

What this means is, YOU, yes you, can no longer find out what dangerous chemicals are stored by businesses in your town or your neighborhood. The sort of chemicals, for example, that caused the West, TX explosion last year. The Dallas Morning News in their research found 74 facilities that have 10,000 pounds or more of ammonium nitrate or ammonium nitrate-based explosive material on site.

Given one of the startling discoveries post West, TX, that

“The fertilizer plant hadn’t been inspected by the Occupational Safety and Health Administration since 1985. Its owners do not seem to have told the Department of Homeland Security that they were storing large quantities of potentially explosive fertilizer, as regulations require. And the most recent partial safety inspection of the facility in 2011 led to $5,250 in fines”

wouldn’t you want to know this stuff was happening near your home, school, or work place? WFAA has a great video report showing Texas regulation at it’s best/worst. Confusion.

Gov. Perry's claim of low regulation is in fact obfuscated by the fact that plants like the one in West, TX are regulated by at least seven different state and federal agencies: OSHA, the Environmental Protection Agency, the Department of Homeland Security, the U.S. Pipeline and Hazardous Materials Safety Administration, the Texas Department of State Health Services, the Texas Commission on Environmental Quality and the Texas Feed and Fertilizer Control Service. If you think this creates a smooth, efficient, low-cost and safe way of managing risk, then "good luck with that".

The good news is Texas isn't much worse than many others, and at least we still have the ultimate freedom: to leave. Unless of course it's via the Mexican border, where Abbott continues to press the Federal Government for tougher, more frequent fencing and border control. What Texas does best is apparently marketing the illusion of freedom via the cheerleader-in-general, Gov. Perry.


Footnotes