HN Jobs

A searchable index of Hacker News “Who is hiring?” job postings.


Job posting (auto-parsed — see raw text)

Website: reddit.com
Location: Los Gatos / San Francisco Bay Area
Salary:
Apply via: Email (jedberg@netflix.com)
Hiring notes:
Tech: Python, Java, Ruby
Parsed locations: Los Gatos / San Francisco Bay Area
Posted by: jedberg
Posted: Sep 1, 2011
Source: Hacker News

Original posting

Los Gatos / San Francisco Bay Area

Netflix

I'm hiring for my team (although there are a ton of other jobs too).

The description is a little light on programming, but it really is more programming than sysadmining.

Netflix is a very open environment -- any engineer can push code to production pretty much any time with almost nothing in the way. There is no release manager or schedule. Maintaining reliability in this environment is a fun challenge!

Our team has three main goals:

* Write tools to help the other engineers know when it is safe to deploy.
* Create monitoring tools to detect issues before users do, fix them automatically if possible, and if not, contact the right people as quickly as possible.
* Take charge of outages and lead the calls until they are resolved, and then follow up to make sure the root cause has been found and fixed.

So if this sounds like something interesting to you, you can send your resume to me at jedberg@netflix.com, and if you have any questions about the job, feel free to comment here (but don't email for questions, because I'd rather answer them here where everyone can see the answer).

Here's a discussion about the job on reddit: http://www.reddit.com/comments/jyaqd/

Here is the full job description from the jobs site:

Netflix is the world's leading streaming video service, and our growth is accelerating. At Netflix, we are upgrading our cloud management tools and pushing the limits of using cloud-based technology, powering our explosive (and soon to be international) growth while presenting new challenges to build a reliable service with ephemeral commodity hardware in an engineer-friendly environment.

As a member of the Cloud Solutions team, you will manage, support and operate the company’s cloud environment. You will build tools to monitor, automatically fix and/or proactively notify service owners of problems before customers notice. You will drive incident resolution and follow through on finding root causes and getting them fixed.

You are an expert in distributed, highly concurrent, web-scale systems that are fault-tolerant and run 24x7 with unparalleled availability. You are a talented devops engineer and you thrive on managing and maintaining a reliable environment that others depend on.

You possess these qualities:

* You see the big picture delivering a 24x7 service
* You are effective working with multiple teams
* You have high standards in everything you do
* You can balance multiple tasks

You have these skills:

* Great communication skills, both verbal and written
* In-depth experience operating a 24x7 production environment
* Fluent in Linux: RedHat, CentOS, Fedora, Ubuntu
* Strong scripting and programming skills (we’re going to ask you to write code on the whiteboard)
* Familiar with the Java platform, especially JVM configuration and JMX
* Knowledgeable in Linux packaging tools: rpm, yum, dpkg, apt
* Ability to quickly triage problems, determine root cause and drive resolution
* Ability to keep a cool head under pressure and effectively participate in system down crisis situations

You may even have these skills:

* Expertise in one or more of the following: Java, Python, Ruby, Perl, shell
* Prior experience with Amazon EC2/S3 or other cloud service providers
* Building systems deployment and service management automation tools
* Familiarity with large scale systems and methodologies

If this sounds interesting then we want to hear from you!