Access Workflows on the go with Slack and PagerDuty - Overview
Teleport allows you to implement industry-best practices for SSH and Kubernetes access, meet compliance requirements, and have complete visibility into access and behavior. But invariably, change happens. Teleport allows users to request elevated privileges in the middle of their command-line sessions and create fully auditable dynamic authorizations . These requests can be approved or denied via ChatOps in Slack, in PagerDuty, or anywhere else via a flexible Authorization Workflow API.
The Slack integration allows users to access role permission requests through Slack messages and approve from within the app.
The PagerDuty integration allows Teleport permission requests to function as PagerDuty incidents. They can be approved or denied through a PagerDuty special action.
Key Topics on Teleport Access Workflows:
- Upgrade your SSH infrastructure from keys to certificates
- Teleport uses certificate-based access with automatic expiration time
- Teleport to Slack integration allows you to treat Teleport access and permission requests via Slack message and inline interactive components.
- Teleport to PagerDuty integration allows you to treat Teleport access and permission requests as PagerDuty incidents — notifying the appropriate team, and approve or deny the requests via Pagerduty special action.
- Teleport to Jira integration allows you to treat Teleport access and permission requests using Jira tickets.
- Teleport to Mattermost integration allows teams to approve or deny Teleport access requests using Mattermost, an open source messaging platform.
What is next?
Slides on Securely Access Compute Resources in Cloud
The slides for Access Workflows on the go with Slack and PagerDuty are on slide share.
Introduction - Escape The Ticketing Turmoil: Access Workflows on the Go with Slack and PagerDuty
(The transcript of the session)
Mark: Well, let’s get this show on the road. Welcome to another Teleport Webinar. This one’s about escaping the ticketing turmoil. I’m sure you’ve all been there. You’ve got all these tickets, and you’re dealing with how to approve them on Slack, how do you approve them on PagerDuty. I’m joined by my colleague Ben Arent, who is the product manager for Teleport, and knows all things Teleport. So, Ben, why don’t you tell us a little bit about your background, and that cool accent I hear that you’ve got.
About Ben Arent
Ben: Thank you, Mark. Thanks for joining today. Welcome and good morning. I’m Teleport’s Director of Product, as Mark said. I’m British. This is the accent. I’ve worked on a range of monitoring systems, exception trackers, and databases. Now, at Teleport, I’m focusing on access systems. Obviously, I love enterprise ticketing systems, and I was a carefree SSH user until joining Teleport, and I’ll hopefully be sharing some of my journey that I learned on the way, on how you can really harden access to SSH, but also still provide more access and flexibility. And, if you didn’t notice that, the one lie is I do not love enterprise ticketing systems. They are kind of the bane of my existence in the past. And so, in this webinar, we’ll go through some of our ideas and thought process around how we’ve created access workflows. So you can get access and integrate with your current tooling, without a lengthy process.
Upgrading Your SSH Infrastructure from Keys to Certificates
Ben: And so, who are you? I pulled the list of attendees for this webinar. The majority of you are from DevOps and SRE background. I’m going to be touching on how you can sort of upgrade your SSH infrastructure from keys to certificates. I’ll be kind of light on this. We have a really good webinar in our archive from Gus Luxton on how to SSH properly and the How to SSH properly blog, and this is even vendor agnostic. You can upgrade your OpenSSH setup with this. I’d highly recommend it. For software developers, there shouldn’t be anything too new for you, but if you’re currently running into problems in your organization about gaining access, you might want to share this with your ops team about getting better access to your systems. For security engineering, locking down access assistance is always more favorable, and there’s a few interesting things that we will cover, where in which you can use techniques to really narrow down the access that you give employees in your organization, but you also make it very easy for break glass scenarios to gain more access. And I know we have a couple people on the call who have 1,000 to 5,000 developers in their organization, and that brings its own problems, and so we’ll sort of cover how you can use this to scale from small startups to very large enterprises, to sort of get it out of the way and get your team done.
Mark: Before you continue there, Ben, I just want to let the audience know that we will take Q&A at the end of the session, so if you have questions, there’s a Q&A button in Zoom that you can put it in there. Ideally, don’t put it in the chat. It’s easier for us to just monitor in the Q&A. I’ll keep track of those, so keep those going. And the first question we really got was: “Hey, are these slides going to be up on SlideShare?” Absolutely. After the session, we will get these posted, as well as a transcript. So they don’t have to take too many copious notes. But back to you.
Maintaining Access Visibility in a Remote World
Ben: Cool, thanks, Mark. So let’s talk about offices. These were places we used to go, pre-COVID, and there were some magical things about physical offices. Sometimes, if you were to plug into the network, you would get access to systems that weren’t available on the public internet. And you’re on a webinar about ticketing systems. You might have also had a Help Desk, or a Team, or some sort of procedure that you’d have to go through for gaining access. But one of the things now, with everyone working from home, these systems have become a lot more difficult to gain access to and get visibility into what people are doing and what they’re accessing. And so, traditional Help Desk would look something like this. It’s lots of internal systems and tooling that’s been built up over time, and they can be very painful to integrate with. When I worked at Rackspace, you would have 15 to 20 different systems that you’d need to go to for different teams within the organization, and sometimes, it could take a very long time to get the access that you needed. And then, if you’re an operator, and you provide access, do you know what that person did when they gained access to those systems? And, in our case, if you were using SSH certificates, did you remove the public key from those machines afterwards?
Secure Developer Access That Doesn’t Get in the Way
Ben: So we’re going to introduce Teleport. Teleport is our Golang tool that we provide. We have an opensource and enterprise product. This webinar mainly focuses on our enterprise customers, thus enterprise ticketing systems, and I’ll show you how our tool doesn’t get in the way. So to sort of step back a little bit, we’ll be focusing on upgrading from SSH keys to certificates. Going back, I said I was sort of a carefree user, I was in the SSH keys camp, but SSH certificates have been around for about seven years. They were introduced into OpenSSH in 6.2, but they can be difficult to maintain and manage. And one of the benefits of using a certificate is it’s issued for a short period of time, and you can configure Teleport for whatever time is good for your organization. So let’s say an employee comes in, they log in, they get a certificate for that period of the day. So if their laptop’s stolen at the bar at the end of the night, it’s okay. The person would have to re-authenticate with their identity provider to get access. And we also provide short-lived Kubeconfigs as well, and the same level of auditing.
What Does a Teleport Deployment Look Like?
Ben: And you might also be thinking, “Hey, it’s 2020. Should we be accessing into machines? Is it an anti-pattern?” We still see customers having break glass scenarios, debugging, there’s always some team that has root [inaudible] service, so that team that you do want to provide access to, you can use Teleport to audit, limit, and scope access. What does a deployment look like? This diagram can be a little intimidating, but I’ll quickly go through it. You have your private networks, and these can be your Amazon VPC, and you have your Auth Gateway. So Teleport sort of splits into these two different modes. The Auth Gateway is our proxy, and you can think of this as a bastion server. So it’s a hardened node outside of your network perimeter that — we go through rigorous security testing, so you’re fine to expose this to the public internet. And you can then use this to access other machines. You might also have heard of jump servers? This normally refers to access between [inaudible] networks. And Teleport can work in this mode through trusted clusters.
Ben: And so, the experience of a developer logging in. They use our TSH command-line tool. They sort of go through our gateway to Auth server. They talk to our identity provider, which enforces second factor. This denotes what team and role they have. We have a certificate authority that issues the certificate, and then they go about their daily business. And so one thing not in this diagram, which we’ll touch on in another slide is rollbacks and rules of how you can approve or deny role escalation.
Why Ticket Plugins and Access Workflows?
Ben: So why ticket plugins and access workflows? To step back a little bit, ChatOps was this term coined by GitHub back in the campfire days, and with Slack and other tools becoming more commonplace for — conversationally, having your co-deployments, your notifications, security event responses, but unless you integrate a tool like Lita or other chat bots, there’s sort of little workflow in that tool. And I’ll go through an example of how the Teleport Slack plugin can be used to approve or deny access.
Sample Request of Slack Integration with Teleport
Ben: And another challenge, if you’ll be like, “Teleport sounds great, but it’s just another tool in our toolbox, we don’t want to be logging into yet another thing,” and that’s also one of the other benefits of using our plugins. And here, we have an example request of our Slack integration, but I’ll just show you in our demo. So for this demo, I’m going to be myself here, and we have Travis, our head of customer success, who is the intern in my fake organization. And so these are Teleport roles that have been mapped to our email identity that have come through Okta. So I’m the admin in this role, Travis, he maps to intern, but he can request this dba role. So here’s some yaml on the slide. It’ll be on SlideShare, but you can see you can set up max_session_ttl, so this is eight hours of access. Another pretty handy thing you can do is you can set in logins, so this would be where you would set EC2 user or host, but you can also configure it to be external logins, or you can use an email address and use a PAM integration to automatically create the Unix user’s own login.
Ben: Another very powerful feature is our node labels, and you can use these to filter which nodes they can see, and then you can use principles to limit access. And one part that is cut off here, you can see — you can define which roles can request other roles. And so, this role can request the dba role. Okay. So hopefully, the demo gods will be good to me today, and I will be demoing our Slack integration.
Mark: Oh, my goodness. A live demo. Ben, you are brave.
Ben: I know. I am.
Mark: Very brave.
Demo Time: Slack
Ben: Okay, so I’m not logged in. So I’m going to log in as Travis using our identity provider. So this browser [inaudible] — this would normally fire up in a browser, but since I’m on this webinar, I’m just going to open up in another window. And so I am now currently going through an Okta flow. And I’ve successfully logged in, and you can see here, I am now logged in as an intern, and I only have the tmp principle. And so before I go too deep into how this works, I’d like this to sort of be a background of our own keys and certificates. So this is probably what you’re familiar to for when you create public-private keys, and before Teleport, you would publish your public key to your whole team. You might have Ansible or an S3 Bucket that each machine gets its keys. And so, Teleport is very similar. It stores all of its certificates in another directory. So here we have all the short-lived certificates used for access. And one of the benefits of using tsh ls is that you can quickly see your hosts.
Ben: So here, we have one of our nodes, and you can see that here are some of our labels, and there’s actually a dynamic label that pulls in the operating system. But if I do tsh ssh, let me see, two user at that host, it’s access denied. And it is access denied, as you can see. I don’t have access for the logins. I only have tmp, I don’t have the correct privilege. So now, I’m going to elevate my role and request this dba role. And in my Slack room, here, you can see we have a room here for this access request. We have the ID — we know which cluster it is. It’s Travis requesting this dba role. And it’s currently pending. So I’m going to approve his request.
Mark: And Ben, this is you acting — the Slack thing, we’re seeing Travis on the left and you in Slack. Is that the way of thinking this?
Ben: Yes. Yeah. This is sort of — I’m the admin and the intern.
Mark: That’s fine. Yep.
Ben: So now, I have these new logins, ec2-user, tmp, root. So let’s try accessing this machine again. And so now, I’m logged into this host. So this is an EC2 server that I have running. And I’m just going to run this other script here to show you one of our Enhanced Session Recordings. Okay, so now I can exit. Okay, and now, I’m going to log out for the next demo. So one of the other benefits of Teleport is we provide a full audit log. So this is the web UI that’s available on the proxy, which also has audit log capabilities. So if I come in here, you can see a similar sort of UI. And in this audit log, you can see the session that just happened. We ran htop, and then we exited and ran another command. And so this can be very helpful along with our audit log, just to see what happened. But one of the other benefits you might have seen, I had an echo, a base64 encoded string, and you can see it code out to a network request to an [inaudible]. And so, you get all this information that you can then pipe into an SIEM solution and [inaudible] if there’s any sort of weird commands happening for that user.
Slack Integration Tips
Ben: Okay, now I’m going to go back to my slides. Okay. So just a summary. You can approve or deny a request from the Slack room. You have an audit log of the approval session recorded, and one thing I didn’t really touch here is the Slack plugin, you can run alongside the proxy, or you can run it on another machine, but it has to be available to the public internet for webhooks. There’s a few gotchas though. In this scenario, you saw there was a room. Anyone in that room can approve it, so we’re best to limit it to a private room. If that’s not strong enough for your team, you can always just set it up to only notify to a room, and then you can use tctl auth, which is a tool that you run on the admin server to approve requests. And then you do also need an extra port for Slack.
Ben: So PagerDuty, that’s the next integration we’re going to go through. I think it needs little introduction with this audience. It provides very powerful on-call management and alerting in escalation. Okay, so we’re just going to do the same thing with PagerDuty. Go back to my [inaudible]. Okay. So and I can actually just go straight and — let me log in again as Travis.
Mark: Yeah, I was going to say, so you’re Travis again. We see you logging in as Travis.
Ben: Yeah, as Travis. So Travis for role escalation. So hopefully, it has remembered my user. Yep, so now I’m logged in, and because I logged out, I only have the intern role again and tmp, so I’ll have to go through role escalation again. And so this is sort of a similar kind of workflow. And one thing you might hear that I’ll — just give it a second. I have a ping on my phone so this is a very handy integration that the PagerDuty app — it’s probably not great in this webinar, but you can see on my phone, I have the ability of — the notification that Travis requested it, and here, I have the ability to approve the requests on the go. And so now, he has been approved.
Mark: Again, [inaudible] the live demo, so, yeah [laughter].
Ben: And if I just go back here —
Mark: Good job.
Ben: — you can see in PagerDuty, you get the same sort of feedback that you can configure PagerDuty. You can escalate and alert the similar person, whoever’s on call, to approve these requests. And then, there are also actions in the PagerDuty app that you can approve or deny the requests. So you don’t even need to go back to the tool to get your job done. Okay, and we go back to our slide deck.
Teleport & PagerDuty
Ben: So another summary, you can approve or deny requests, or you can use the mobile app for actions. I didn’t go through the audit log in this example, but you also have the audit log and session recording. It can be set up to auto-approve. I’m going to dive into this next. And note, also, anyone who’s on that role can also approve the request. So one of the really cool benefits of PagerDuty is it has this very advanced scheduling system. And part of this is that you can connect your Okta user identity, which is the email address, also with your PagerDuty user, which would also be an email address. And you can define a period of time, whoever’s on call. If they request a role, they would automatically get that role approved if they’re on call. And this is a very powerful feature because you could — if you’re doing “follow-the-sun”, or you have limited on-call for your security team, whoever’s on call has — when they’re not on call, they have a very limited role, but when they’re on call, they get a much more extensive role. And so you’re sort of underprovisioning the resources for that user.
Ben: And so, in my case, if I was on duty and I requested super-admin, it would automatically get approved and no one else would need to help me, but if Gus needed it, he could also request it, and you could go through a similar flow to approve it. And these plugins configured with sort of TOML configs, which is very similar to YAML, and it’s as simple as just setting a tree flag. And if you have multiple service IDs, you can just sort of configure it on multiple of these. Another thing I haven’t touched on yet is how you auth with Teleport. And so you export keys and certificates that access the Teleport API, and these do need access to the auth server. So there’s a little bit of hardening there.
Mattermost and Jira Plugins for Teleport
Ben: We have two other plugins, we have Mattermost and Jira. Mattermost is an open-source chat tool. I think Jira needs little introduction. And these are open-source plugins. We actually worked with an external development team, the Evil Martians, to help expand some of these plugins and extend our API. So if you go to this Git Repo, you can see some of the examples of how they have built and extended the Teleport API to build these plugins. And this can be a good foundation if you have your own internal tooling that you want to integrate with. Of course, if you have any kind of request that you think are probably more generic, like you really want Microsoft Teams, please leave an issue in our tracker. And since this has been released, we’ve already got a bunch of feedback, and so, we’re extending our approval workflow. We have this request for discussion currently open. Teleport is an open-core company, so please review this, leave comments. The more open feedback we get, the better our product is.
Extended Approval Workflows
Ben: And for this extended approval workflow, these are some of the key benefits. So one very helpful thing is being able to request access to clusters. This is very helpful if you are an MSP, which is the Multiple Service Provider, and you’re managing other people’s clusters, and you want to access those clusters, and maybe get the owner of that cluster to approve it. Accessing based on some ticket number that you’ve already been assigned or a request. This will be very helpful if a developer needs access to a system, but they don’t know exactly what they need to access, they can send in a note. When the operator running Teleport gets that, they can create a role, or it can be programmatically created to provide access narrowed specifically for what their request is. And another feature is just to request access to specific nodes, and we’ll add a flag so you can pull the API manual lead, so you get the access request approved, so it’s not always in real-time. All right. So, Mark, any questions?
Mark: Yeah, thanks a lot. So, for the audience there, definitely use — we’ve got a couple of questions come in already, but add any more. Just click on this little Q&A button at the bottom of your screen. The first question I have here, Ben, is: “Are the plugins open source?” The thing you were showing from Evil Martians and stuff, is that all in open source?
Ben: Yeah, it’s on github.com Gravitational/Teleport-plugins.
Mark: Great. Second question was: This whole workflow access — is this available in open source as well, or is this just part of the Enterprise product?
Ben: It’s only in our Enterprise offering.
Mark: Cool. Only in Enterprise. So drilling down a little more into maybe the plugin [inaudible], question is, “How hard it is to actually write these plugins in go?”
Ben: So it uses gRPC API. We have many examples, which are very flexible and extendable. We also have a generic webhook. And so, if you have a tool like Zapier or some other automation, that makes it a little bit easier to integrate with your tooling. So you need to, ideally, be a Go developer, but you use the webhook if you want to integrate with other tools.
Mark: Great. Another one is: “How does Teleport know what on-call schedule to check for auto-approval?”
Ben: It is tied in PagerDuty. So when you create the service, you can — let me come in here. So we have configuration, we have services.
Mark: So the auto-approval thing you’re talking about— this is a function of PagerDuty?
Ben: Yeah, PagerDuty is kind of like the brain of who’s on call.
Mark: Got it.
Ben: So we basically send the name — I can’t find it right now, but I can follow up for that question.
Mark: No, that’s fine. Yeah, so I think clarity’s just that the stuff you’re showing there, it’s sort of a PagerDuty functionality, and it’s whatever PagerDuty uses to determine on-call schedules.
Mark: Okay, awesome. Another question came in is: “Hey, will this recording and slides be available afterwards?” Absolutely. We’ll post the recording of this, give us a couple of days, the slideshow will go up, as well as will a transcript. So for those of you worried about that. What other questions? I think the last question I have here, and, well, it was around auto-approval. I think we already discussed it, too, is, “Is auto-approval a Teleport function or PagerDuty?” We already discussed that one. I think those were the main — oh, a couple more just came straight in as I was about to close up there in questions. So if we use ServiceNow, would we be able to use the webhook for determining the on-call person?
Ben: We did research ServiceNow, and I know we have another customer looking at it. I think it depends on the version. It can be a bit complicated, just because the ServiceNow product’s sort of huge. So I think the best case would be get in touch with us. We can probably figure out some sort of solution for you.
Mark: Okay, awesome. And Vladimir, if you’re out there, I just want to — I’ll read this question, I didn’t — and if you understand it, Ben, let me know, but it’s: “Is it the same servers as for sending approval requests? In this situation, people seeking auto-approval receive the PagerDuty message.”
Ben: People seeking auto-approval —
Mark: I wasn’t sure —
Ben: No. So if you are — if I’m reading this correctly, if you’re trying to request — if you’re auto-approved, no one is notified.
Ben: If that makes any sense in the PagerDuty world.
Mark: Yeah. Vladimir [crosstalk] —
Ben: I know Vladimir, so I can follow up afterwards with him.
Mark: Okay, cool.
Ben: Thanks for answering.
Mark: Hey, no problems there. Any last questions out there? I think those are the ones that sort of I see in the queue. No more jumping out. So, Ben, definitely thank you for your time, and thank you for your effort in sort of going through this. What is the next sort of — if someone’s sort of interested in this, and wants to get involved, where would you send them to? What are the main next steps for everyone?
Recommended Next Steps
Ben: Yes. I mean, this might be a bit deep in the weeds, but if you’re already a Teleport user, I would go through the Teleport Approval workflows doc. Like I mentioned, you do need the enterprise product, so contact our sales team, and they can get you a POC and trial license set up so you can try this out. If you’re an open-source user, and you just want to use the [inaudible] infrastructure, you can just download Teleport. I’d run it — this workflow is not in the opensource edition, but it’s still a great product to use. And we’re also on GitHub, we’re an open-core company, you can see we have a request for discussion. So create issues, feel free to reach out to us on GitHub.
Some Key URLs
Mark: Well, hey, Ben, appreciate your time and effort, and like Ben said, if you do have questions, reach out to us. We also have a community website, you can ask general questions there, but thanks to everyone that’s attended the webinar. Let your friends and family know to watch the recording, if everyone wasn’t able to see it. And Ben, thanks a lot for your time and your ability to do two live demos without anything crashing.
Ben: I know.
Mark: So that’s a new record.
Ben: Thank you, Mark. Thanks, everyone.
Mark: Thanks, everyone. Bye-bye.