-->

Saturday, April 18, 2015

Remote Work + Data Science

I've been working as a remote data scientist for nearly a year now. Our team (of two!) is fully distributed and we're in the process of adding another data scientist. Finding other remote data science jobs is pretty difficult, so I decided to start another blog to champion the idea of remote data science and track jobs that fit that description. Please visit www.RemoteDataScience.com and let me know what you think!

Tuesday, April 14, 2015

Linux Date Injection into Hive

This week I found myself needing to generate a table in Hive that used today's date in the output location. Basically I was running a daily report and wanted it to automatically send the output to the appropriate bucket on S3.

To accomplish this, I used a combination of embedded Linux commands and Hive variables.

First, in your Hive query, you need to turn on variable substitution:

set hive.variable.substitute=true;

Next, reference a variable anywhere in your Hive query and Hive will substitute its value before the statement runs. For instance, you can create a table like this:

CREATE EXTERNAL TABLE IF NOT EXISTS my_table
(
    values STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
LOCATION 's3://mybucket/${hiveconf:DATE_VARIABLE}';

The Hive syntax for a variable is ${hiveconf:VARNAME}. When calling Hive, you can give it a variable by using the -hiveconf VARNAME=VALUE syntax. For instance:

hive -hiveconf DATE_VARIABLE=$(date +y=%Y/m=%m/d=%d/h=%H/) -f query.sql

Notice that the value of the variable in the above command is $(date +y=%Y/m=%m/d=%d/h=%H/). This is the shell syntax for command substitution: the shell executes the command inside the $( ) and replaces it with the output. You can also use backticks ( ` `) instead of $( ). Essentially the date command runs first, returning a date string like y=2015/m=04/d=13/h=09/, and that string is assigned to the Hive variable. The variable is then substituted into the Hive query to build a custom table location.
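If you want to check the generated path before handing it to Hive, here is a minimal sketch of the same command substitution done in two steps. DATE_PATH is just a name I picked for illustration, and the script only prints the hive command instead of running it, so it works on a machine without Hive installed.

```shell
#!/bin/sh
# Capture the output of `date` once so the same value can be
# inspected and reused. DATE_PATH is a hypothetical variable name.
DATE_PATH=$(date +y=%Y/m=%m/d=%d/h=%H/)

# Show the S3 location the table would point at.
echo "Output location: s3://mybucket/${DATE_PATH}"

# Print the Hive invocation rather than executing it.
# Remove the leading `echo` to actually run the query.
echo hive -hiveconf "DATE_VARIABLE=${DATE_PATH}" -f query.sql
```

Storing the value in a shell variable first also guarantees that every place you use it sees the same hour, even if the script happens to run across an hour boundary.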

Super handy.

Sunday, April 5, 2015

Ham Technician License

After about 15 years of it being on my "to do" list, I finally took, and passed, the ham Technician license exam. After 10 years of EE education, it wasn't all that difficult. I did read the excellent guide from KB6NU (http://www.kb6nu.com/study-guides/) to get me up to speed on the regulation aspects and the "ham lingo" I didn't know.
I'm not sure what I'll do with it, but it's nice to know I have more spectrum and transmit power accessible for when I figure it out!