
You Are What You Read

Story Horse?

I like to join and take part in a few groups, meetups and hackathons, mainly technology ones.
When I arrived in this country, I had the opportunity to start, or help to start, IT in Dublin, a group for Brazilian IT people who live in Dublin.

I found this blog post a few months ago and shared it in the group.
It may have changed how a few others think about coding.

One of these changes of mind led to the decision to create a coding event: “codeAcademy : Knights of the old republic”

I have always been the kind of person who likes to learn by practising. So, to gain more knowledge, I decided to join and learn coding while having fun.

You can read the summary here.

Dublin, Ireland is an awesome place for anyone who works as a developer. Why?

The idea of “Write Code Every Day” may not have changed me at first, but now I think I realise how important it is and how much I like doing it.

Thanks a lot to @AndresGrams and @le_jucas, who spent this weekend with me.
I would like to join other code events like this, and I am really happy that somehow one simple post changed a community's mindset, or at least that of a few people there.

I hope this was the first step …

Hack.guides() 2016 Tutorial Contest – I support

Story Horse?

Today I want to share the Hack.guides() 2016 Tutorial Contest, a tutorial competition.
http://tutorials.pluralsight.com/contest/
http://tutorials.pluralsight.com/faq/

It is a good idea to share knowledge and help the community. I really liked the idea and decided to participate and support it. I have already published my tutorial there and I am thinking of doing another one. I wrote about Red Sqirl:

http://tutorials.pluralsight.com/big-data/red-sqirl-data-analytics-platform-introduction?status=in-review

You can see more about Red Sqirl here:

https://igfasouza.wordpress.com/2016/07/13/red-sqirl-overview/

https://igfasouza.wordpress.com/2016/07/14/red-sqirl-first-step-tutorial-with-pokemon/

http://www.redsqirl.com/

Red Sqirl – first step tutorial with Pokemon

Story Horse?

Today I’ll show you how to take your first steps with Red Sqirl.

This tutorial is based on this Docker image: https://hub.docker.com/r/redsqirl/cloudera/

I’m going to show you how to start a basic ETL using Red Sqirl with a major trending topic at the moment: Pokemon Go.

I just Googled and found a list of Pokemon: http://pokemondb.net/pokedex/all

and a list of all the Pokemon in the Pokemon Go game: http://www.pokemongodb.net/2016/05/pokemon-go-pokedex.html

I just copied the tables into two CSV files, pokemon.csv and pokemonGo.csv. I also removed the special characters and images.
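The clean-up step can also be scripted. The post does not say which special characters were removed, so this is only a minimal sketch that assumes the goal is plain ASCII text; the helper names clean_cell and clean_csv are made up for illustration.

```python
import csv
import io
import unicodedata


def clean_cell(text):
    """Reduce a cell to plain ASCII (assumption: this is what 'special characters' means)."""
    # Split accented letters into base letter + accent, then drop everything non-ASCII.
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse any whitespace left behind by removed symbols.
    return " ".join(ascii_only.split())


def clean_csv(raw_text):
    """Return the CSV text with every cell cleaned."""
    reader = csv.reader(io.StringIO(raw_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([clean_cell(cell) for cell in row])
    return out.getvalue()


# Made-up sample rows, not the real contents of pokemon.csv.
sample = "Nidoran♀,Poison\nFlabébé,Fairy\n"
cleaned = clean_csv(sample)
print(cleaned)
```

To clean a real file, read it with open(path, encoding="utf-8") and write the returned text to a new file before copying it into the container.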

To start, we are going to copy one file into the Docker container and then copy it to Hadoop.


sudo docker ps   # lists the running containers and shows the container ID, for example 3d7ac2fccb23

sudo docker inspect 3d7ac2fccb23

sudo docker cp pokemon.csv 3d7ac2fccb23:/tmp

Now you have the file inside the Docker container.

Now, in Red Sqirl, click on Remote File System.

  • Click the plus symbol to create a new ssh connection to a remote server.
  • Host : localhost
  • Port : (leave unset)
  • Password : Give your password (in this case it is redsqirl)
  • Save (check this if you want to save)
  • Now we are able to see the file.
  • Click the plus symbol to create a new directory on the Hadoop file system and name it pokemon.mrtxt
  • Now drag and drop the file from Remote to HDFS.
  • Now you can do the same for the other file.
  • Create a folder called pokemonGo.mrtxt

Create a Workflow
Set up a Source Action

This task will show you how the source action can be configured to select flat files and change properties such as the file's delimiter, headings and types.

In the Pig footer, drag a new Pig source icon onto the canvas.

  • Double click to open source.
  • Name the action “pokemon“.
  • Comment the action “this is a tutorial using Pokemon data“.
  • Click OK.
  • On the data set screen, click on the path field or on the button.
  • Click on the radio button beside “pokemon.mrtxt”. If you cannot find it, refresh the view by clicking on the search button, or navigate the file system.
  • At this stage, you will see the data correctly displayed on the screen; the field names are “Field1 string, Field2 string…”
  • On the feature title line, click on the edit button.
  • Once it appears you can choose “Change Header”
  • Copy and paste “Name STRING,Type STRING,Total INT,HP INT,Attack INT,Defence INT,Spatk INT,Spdef INT,Speed INT” into the value field.
  • Click OK. You will have the confirmation that the Header is correct.
  • Click OK to exit from the Configuration window.
  • If you leave the mouse cursor on the source action you will be able to see some configuration details

Now we are going to repeat the same steps, but call this action pokemonGo and select pokemonGo.mrtxt as the source. The header is “NAME STRING, TYPE1 STRING, TYPE2 STRING”.
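To make the effect of “Change Header” easier to picture, here is a small Python sketch of what a typed header means for one row of data. It is not Red Sqirl or Pig code; parse_header, cast_row and the sample row are assumptions made for illustration only.

```python
# Map the type names used in the header to Python conversions (illustrative only).
CASTS = {"STRING": str, "INT": int}


def parse_header(spec):
    """Turn 'Name STRING,Total INT' into [('Name', 'STRING'), ('Total', 'INT')]."""
    fields = []
    for part in spec.split(","):
        name, type_name = part.split()
        fields.append((name, type_name.upper()))
    return fields


def cast_row(fields, values):
    """Apply the declared type of each field to one row of raw text values."""
    if len(fields) != len(values):
        raise ValueError("row has %d values, header has %d fields" % (len(values), len(fields)))
    return {
        name: CASTS[type_name](value.strip())
        for (name, type_name), value in zip(fields, values)
    }


POKEMON_HEADER = (
    "Name STRING,Type STRING,Total INT,HP INT,Attack INT,"
    "Defence INT,Spatk INT,Spdef INT,Speed INT"
)

fields = parse_header(POKEMON_HEADER)
# Sample row written by hand for this sketch, not read from pokemon.csv.
row = cast_row(fields, "Bulbasaur,Grass Poison,318,45,49,49,65,65,45".split(","))
print(row["Name"], row["Total"] + 1)
```

Before the header is changed, every column is a string (“Field1 string, Field2 string…”); after it, numeric columns such as Total can be used in arithmetic and comparisons.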

With two sources on the canvas, we can do the Pig join.

Perform a Pig Join Action

Drop a Pig join onto the canvas.

  • Create a link from “pokemon” to the new pig join action.
  • Create a link from “pokemonGo” to the new pig join action.
  • Double click the pig join and call it “pokemonData”.
  • The first page lists the table aliases; click next.
  • On the following page, make sure that “copy” is selected as the generator and click OK.
  • Here we remove name and type from the pokemon table.
  • Click next.
  • This page has two interactions that specify the join type and the fields to join on. We use the default join type, which is “Join”, so this does not need to be changed.
  • In the “Join Field” column, type “pokemon.name” and “pokemonGo.name”. This condition will join the two tables together.
  • Click next.
  • Click next on the sorting page.
  • Click OK on the final page.
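For readers who want to see what this join produces, the steps above can be sketched in plain Python. This is not the Pig script that Red Sqirl generates; the inner_join helper and the sample rows are assumptions for illustration.

```python
def inner_join(left, right, left_key, right_key, drop_left=()):
    """Inner join two lists of dicts, dropping the listed columns from the left side."""
    # Index the right table by its join key so each lookup is cheap.
    index = {}
    for right_row in right:
        index.setdefault(right_row[right_key], []).append(right_row)

    joined = []
    for left_row in left:
        # Rows with no match on the other side are left out (inner join).
        for right_row in index.get(left_row[left_key], []):
            merged = {k: v for k, v in left_row.items() if k not in drop_left}
            merged.update(right_row)
            joined.append(merged)
    return joined


# Hand-written sample rows, not the real files.
pokemon = [
    {"Name": "Bulbasaur", "Type": "Grass Poison", "Total": 318},
    {"Name": "Chikorita", "Type": "Grass", "Total": 318},
]
pokemon_go = [
    {"NAME": "Bulbasaur", "TYPE1": "Grass", "TYPE2": "Poison"},
]

# Join on the name and drop name and type from the pokemon side, as in the steps above.
pokemon_data = inner_join(pokemon, pokemon_go, "Name", "NAME", drop_left=("Name", "Type"))
print(pokemon_data)
```

Only Pokemon that appear in both lists survive the join, which is why the result is a Pokemon Go dataset enriched with the stats from the full list.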

Now you have a simple Pokemon Go dataset to start your data analysis …


Red Sqirl – Overview

Story Horse?

Today I want to show a little bit about the project that I have been working on for the last 3 years, called “Red Sqirl”.

Red Sqirl is a web-based big data application that simplifies the analysis of large data sets.

I’m going to talk a little bit about the architecture, but you can have a look at http://www.redsqirl.com to see all the other details.

Red Sqirl is a web application that you can install directly on top of a Hadoop cluster. It is currently available only for Tomcat.
It uses Tomcat as the web server, but when you log in, it creates another process owned by the logged-in user and makes key components available over RMI. Every action in the application is run through the user's process to avoid permission conflicts.

Architecture: a Java web application based on the JSF framework and HTML5. You can see all the source on GitHub: https://github.com/idiro/redsqirl

Red Sqirl contains a drag-and-drop view. The user drags objects onto a canvas to build a workflow. The technology here is kinetic.js; you can see a basic intro: https://igfasouza.wordpress.com/2014/01/08/html5-the-future-kineticjs/

The canvas is where a workflow lives. Red Sqirl uses a workflow to manage a job's processes, or flow. A workflow is a build-up of processes that chain together and perform corrective actions to produce a desired output. It is a way of managing a job so that each aspect of the job can be modified to use the desired parameters.

Basically, the user double-clicks an object and configures a few parameters to perform a task. These tasks are submitted to Oozie, and Oozie manages the workflows so they can run in parallel with other jobs.
More about Oozie – http://oozie.apache.org/
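The idea of a workflow as chained processes can be sketched in a few lines of Python. This is only an illustration of running actions in dependency order; it does not use the Red Sqirl or Oozie APIs, and run_workflow and the sample actions are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(actions, dependencies):
    """Run each action after the actions it depends on, passing their outputs as inputs."""
    # dependencies maps an action name to the list of actions it needs first.
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        inputs = [results[dep] for dep in dependencies.get(name, ())]
        results[name] = actions[name](*inputs)
    return order, results


# Two hypothetical source actions and one join action, echoing the Pokemon tutorial.
actions = {
    "pokemon": lambda: ["Bulbasaur", "Chikorita"],
    "pokemonGo": lambda: ["Bulbasaur"],
    "pokemonData": lambda all_names, go_names: [n for n in all_names if n in go_names],
}
dependencies = {"pokemonData": ["pokemon", "pokemonGo"]}

order, results = run_workflow(actions, dependencies)
print(order)
print(results["pokemonData"])
```

The two sources do not depend on each other, so a scheduler is free to run them in parallel; this sketch simply runs them one after the other.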

Red Sqirl runs alongside the Hadoop platform and other Hadoop technologies. The main technologies for storage are Hive and HDFS. These jobs can be saved and used again in the future. Saved jobs can be opened and modified to run with different parameters. The output of these jobs will be saved to the appropriate storage facility (Hive or HDFS).

Hadoop is a distributed system that allows MapReduce processes to be run over the data stored in these technologies.
More about Hadoop – http://hadoop.apache.org/

Once you finish, you can share your canvas. We call this a Model. A Model can be shared with other users in a marketplace: http://marketplace.redsqirl.com/
You just need to fill in a form with some information about it and upload the zip file.

Red Sqirl is extensible, so you can create a new plug-in using a new technology. We call this a Package. Packages are groups of actions that are used to perform specific processes in Red Sqirl. You can see how to install or upload a Package here: http://www.redsqirl.com/packagemanagement.html and how to create one here: http://www.redsqirl.com/pckdev.html

You can see all Models and Packages here: http://www.redsqirl.com/search.html

Red Sqirl supports all the trending Hadoop ecosystem APIs. The idea is an online tool where you can do data analysis in a simple way. You don’t need to know the Pig syntax or the Spark syntax to use them.

In the future I’ll do a first-steps Red Sqirl tutorial.

Software Craftsmanship

Story Horse?

Software Craftsmanship is a metaphor that can radically transform the way that we create and deliver software systems, with implications for the way we develop software, manage teams and deliver value to the users. It is an approach to software development that emphasizes the coding skills of the software developers themselves. http://en.wikipedia.org/wiki/Software_craftsmanship

The software craftsmanship movement talks about practising as a way to develop programming skills and become software craftsmen. Technical practices are considered important; it takes time to learn them and become better programmers.

The manifesto for software craftsmanship

The book Clean Code: A Handbook of Agile Software Craftsmanship (Robert C. Martin) is an excellent place to start if you haven’t already. I’m not the first, and definitely not the last, to compare coders to craftsmen, both of today and of earlier times in history.

I would suggest reading Software Craftsmanship: The New Imperative

Doing a Google search, I found Building Software Craftsmen:
“To become craftsmen, programmers need to gain ‘real-world experience and practical applications of knowledge’”

How can programmers develop their skills to become software craftsmen?
I Googled again and found “Why I Don’t Do Code Katas”:
“if you want to get better at something, repeating practice alone is not enough. You must practice with increased difficulty and challenge.”

“craftsperson is someone who not only creates something from nothing from materials of their choice, but usually puts a part of themselves into what they make.”

The Codesmith

Anyone Can Be A Codesmith

Further reading

http://dannorth.net/2011/01/11/programming-is-not-a-craft/

http://www.infoq.com/news/2014/11/becoming-software-craftsmen?utm_content=buffere9dbf&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

Top 10 Books for Developers

Story Horse?

Recently I saw the website “41 Websites Every Java Developer Should Bookmark” and had the idea to make my own list: a list of top books. I did some research and found this.

My top ten list includes many of the same books as his list, but mine has a few that are different.

A list of books that every programmer should read.

Domain-driven Design: Tackling Complexity in the Heart of Software – Eric Evans

Patterns of Enterprise Application Architecture – Martin Fowler

Refactoring: Improving the Design of Existing Code – Martin Fowler

Clean Code – Robert C. Martin

The Clean Coder – Robert C. Martin

Design Patterns: Elements of Reusable Object-Oriented Software – Erich Gamma

The Pragmatic Programmer – Andrew Hunt

Refactoring to Patterns – Joshua Kerievsky

Head First Design Patterns – Kathy Sierra

Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions – Gregor Hohpe

Startup: the basics

Story Horse?

I was talking with some friends about startups and discovered that I have some good reference materials for people who are interested in the area.

First of all, what is a startup?

Top 10 books

Business Model Generation – Alexander Osterwalder – http://www.businessmodelgeneration.com/

The Lean Startup – Eric Ries – http://theleanstartup.com/

Running Lean – Ash Maurya – http://theleanstartup.com/the-lean-series

The Four Steps to the Epiphany – Steve Blank

The Entrepreneur’s Guide to Customer Development – Brant Cooper – http://www.custdev.com/

The Startup Owner’s Manual – Steve Blank and Bob Dorf

Art of the Start – Guy Kawasaki – http://www.guykawasaki.com/the-art-of-the-start/

The Other Side of Innovation – Vijay Govindarajan – http://mba.tuck.dartmouth.edu/pages/faculty/chris.trimble/osi/

Sua ideia ainda não vale nada – http://www.evmportal.org/index.php?option=com_k2&view=item&id=645:presentaci%C3%B3n-de-la-escuela-virtual-del-mercosur-y-taller-oportunidades-de-la-formaci%C3%B3n-virtual-para-emprendedores-y-pymes&Itemid=296

O Livro Negro do Empreendedor – Fernando Trías

Links I recommend

http://bizstart.com.br/

http://startupweekend.org/

Blog – http://www.startupms.com.br/nossos-livros/

Blog steveblank – http://steveblank.com/2011/09/22/how-to-build-a-web-startup-lean-launchpad-edition/

Blog Venture Hacks – http://venturehacks.com/

Blog Startup Marketing – http://www.startup-marketing.com/

Blog OnStartups – http://onstartups.com/

Blog For Entrepreneurs – http://www.forentrepreneurs.com/

Blog instigator – http://www.instigatorblog.com/

Blog A Smart Bear – http://blog.asmartbear.com/jason-cohen

Blog Startup Lessons Learned – http://www.startuplessonslearned.com/

Motivational Video – https://igfasouza.wordpress.com/2013/12/31/only-makes-mistakes-who-tries-something/