Introduction

"A computer lets you make more mistakes faster than any invention in human history - with the possible exception of handguns and tequila." --Mitch Ratcliffe

The Internet has gone through a massive transformation since its inception. From a tool used mostly by academics, it has become a pervasive tool used by just about everyone to communicate, shop, pay bills, invest, and entertain. While the use cases never cease to increase, one aspect of Internet usage remains problematic: educating the public about the risks involved in living a connected life, and about the ways people can defend against attacks.
Over the past couple of years my team has iterated several times on the proper way of managing systems using Puppet. For a while it was a gigantic time sink while we tested and prototyped several different approaches to configuring things, with many frustrating failures. This post will be an exploration of some of the lessons learned.

Lesson #1: Puppet is not deterministic

Yup, that's right. The tool you're trying to use to get all your servers to a deterministic state isn't very deterministic in resolving that state.
Tinc is a neat little VPN daemon that I've recently come across. It is surprisingly simple to configure, yet powerful. In this post, I'll show you how to set up a meshed VPN between four nodes, with one of the servers acting as a DHCP server. In this fictitious scenario, let's assume the following nodes: dev is a CentOS cloud server with a fixed public IP address; we'll designate this one as our DHCP server.
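To give a taste of how little configuration tinc needs, here's a sketch of what the dev node's setup might look like. The net name (mynet), the documentation address 203.0.113.10, and the port are placeholder assumptions, and the files land under a temporary directory instead of /etc/tinc so the sketch is safe to run as-is. Switch mode is the relevant choice here: it bridges the mesh at layer 2, which is what lets one node serve DHCP to all the others.

```shell
# Hypothetical tinc layout for the "dev" node. A real deployment writes
# to /etc/tinc/<netname>/; a temp dir is used here so this can run as-is.
TINC_ROOT="$(mktemp -d)"
mkdir -p "$TINC_ROOT/mynet/hosts"

# Main daemon config: Mode = switch makes the VPN behave like one big
# layer-2 segment, so DHCP broadcasts from dev reach every node.
cat > "$TINC_ROOT/mynet/tinc.conf" <<'EOF'
Name = dev
Mode = switch
AddressFamily = ipv4
EOF

# Host file for dev: the fixed public IP the other three nodes dial
# in to. 203.0.113.10 is a documentation address standing in for it.
cat > "$TINC_ROOT/mynet/hosts/dev" <<'EOF'
Address = 203.0.113.10
Port = 655
EOF

echo "wrote config under $TINC_ROOT/mynet"
```

The other nodes get the same layout plus a `ConnectTo = dev` line in their tinc.conf, since dev is the only node with a stable address to dial.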
In this post, I'll go over how to use iptables and ipset to create a basic firewall with ssh brute-force protection and geo-blocking. I'm assuming CentOS here; adjust paths and commands accordingly for other distributions. Ipset is a tool to create and maintain IP sets in the Linux kernel. The advantage of using ipset over setting up a bunch of individual rules comes down to CPU utilization: ipset can handle thousands of entries without CPU degradation, whereas introducing thousands of rules in iptables will have a noticeable impact on packet processing speeds.
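To make the shape of such a firewall concrete, here's a sketch that writes the rules out in iptables-restore format rather than applying them, so it can be reviewed (and run) without root. The set name geoblock and the 5-hits-in-60-seconds ssh threshold are illustrative assumptions; on a real box you'd first create the set (e.g. `ipset create geoblock hash:net`) and load it with the country CIDRs before restoring the rules.

```shell
# Sketch only: emit a minimal ruleset with ssh brute-force protection
# (the "recent" match) and geo-blocking (an ipset named "geoblock" that
# is assumed to already exist and be populated with blocked CIDRs).
# Review the file, then load it with: iptables-restore < basic.rules
cat > basic.rules <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT ACCEPT [0:0]
# housekeeping: allow loopback and already-established traffic
-A INPUT -i lo -j ACCEPT
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
# geo-blocking: one rule checks the whole set, however large it is
-A INPUT -m set --match-set geoblock src -j DROP
# ssh brute force: track new connections per source, drop any source
# that opens 5 or more within 60 seconds
-A INPUT -p tcp --dport 22 -m state --state NEW -m recent --set --name SSH
-A INPUT -p tcp --dport 22 -m state --state NEW -m recent --update --seconds 60 --hitcount 5 --name SSH -j DROP
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT
EOF
echo "rules written to basic.rules"
```

Note how the geo-block costs a single rule regardless of how many networks the set contains; that is exactly the CPU argument made above.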
After using PDI for a while, you start to encounter some common problems: PDI crashes, databases die, connections get reset; all sorts of interesting things can happen in complex systems.

As a general rule, when building PDI jobs, I always strive to find a way to make each job re-playable and idempotent, so that re-running it after a failure never changes the outcome. This can be tricky given an unlimited input set over time.
Probabilistic data structures to the rescue!
To do this, at work we created a PDI bloom filter step (thanks, Fabio!). This article will go over how it works and its use cases.
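PDI steps aside, the replay-guard idea is easy to see in miniature. The sketch below uses an exact seen-keys file instead of a bloom filter (a deliberate stand-in, since shell has no bloom filter at hand), but the contract is the same one the step provides: rows whose key has been seen before are skipped, so re-running a job after a crash doesn't double-load anything. A real bloom filter replaces the ever-growing file with a fixed-size bitmap, at the cost of a small false-positive rate. The names below are all hypothetical.

```shell
# Stand-in for the bloom-filter step: an exact seen-keys file.
# process_row is a placeholder for the real work (e.g. loading a row).
SEEN="$(mktemp)"
process_row() { echo "loaded: $1"; }

load_if_new() {
  key="$1"
  if grep -qx "$key" "$SEEN"; then
    return 0                   # key seen before: skip on replay
  fi
  process_row "$key"
  echo "$key" >> "$SEEN"       # remember the key for future runs
}

# First run loads both rows; replaying order-1001 is a no-op.
load_if_new order-1001
load_if_new order-1002
load_if_new order-1001         # replayed row: silently skipped
```

With a bloom filter the `grep` becomes a handful of hash-and-bit-test operations against constant memory, which is what makes the approach viable for an unbounded input set.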
Here's a quick post explaining how to do something that may not be obvious. The scenario: you've got some serialized data stored in a not-so-portable data interchange format (serialized PHP), and you'd like the data to be made available as part of a PDI transformation.
A common problem when starting a new project is getting fixtures in place to facilitate testing of reporting functionality and refining data models. To ease this, I've created a PDI job that creates the dimension tables and populates a fact table.
Here are some aliases that I've used often to view packet payloads using tcpdump while filtering out all the overhead packets (only packets that actually contain payload are shown). The BPF expression subtracts the IP header length and the TCP header length from the IP total length; a non-zero result means the packet carries data. I usually stick the following lines into my .bashrc on all the servers I install.

alias tcpdump_http="tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -A -s0"
alias tcpdump_http_inbound="tcpdump 'tcp dst port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -A -s0"
alias tcpdump_http_outbound="tcpdump 'tcp src port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -A -s0"