Practical Tips for Fresh Computer Science Engineers to Land a Better Job.

This article is part of our recent workshop for final-year engineering students, all of whom are waiting for their placement companies and expecting a good job before the end of 2017.
Yes, their expectations are reasonable, and they can even land a good job earlier. No, I am not going to reveal any secret trick, and I am not going to ask you to buy any course to get a job as a fresher. What I am offering are just a few selling points to position yourself better online. Get your online presence right and recruiters themselves will find you with a job offer.

Let's discuss this as a set of points, in decreasing order of priority. So, first things first.

  1. Never copy your resume from someone else. Your resume is a summary of you, and you should not copy it from others. Write it from scratch, and if you get stuck, Microsoft Word's resume templates can help you put together a good one. Spend time on your resume, make it error-free, and know everything on it inside out.
  2. Don't cram too much into your resume. I usually see freshers listing C, C++, Java, JavaScript, Linux, Mac, Ubuntu, Big Data, and every other piece of jargon they have ever heard. Beware: this approach actually kills your chances of selection. Selection depends on your communication with the interviewer, and if you list too much, you only create confusion and cut your own chances short. Ideally, you should mention only those skills you actually know and have used in a project. If you still want to list more skills, segregate them by level of expertise with clear labels.
  3. Prepare your projects well, including their architecture (technical stack). The projects you did during your pre-final and final years play a deciding role in your job interview. A well-presented project can increase your chances many times over. So mention your genuine projects in your resume, along with a bit about the technical stack and architecture. Listing the front-end technology, back-end technology, server, and development tools is enough at the fresher level.
  4. Don't forget "Tell me about yourself." This is every interviewer's favorite question and the key to starting an interaction. Prepare it well and practice it in front of a mirror, a teacher, friends, seniors, or whoever you like. Your answer should cover things that are not obvious from your resume, so don't simply recite your resume in reply.
  5. Start using LinkedIn as early as possible. Ideally you should create your LinkedIn profile in your first year; if you haven't, create it and start using it now. LinkedIn is one of the most powerful business and job-finding tools available. Your profile should be error-free and well maintained. Don't overdo things; just make an honest effort and present yourself honestly and clearly. LinkedIn's own profile-building prompts will guide you toward a 100% complete profile.
    Here, I am revealing a trick for the first time in this article. We all have a tendency to visit someone's profile back when they have visited ours, so use this tendency wisely. Search for prospective companies and find the profiles of their HR staff and senior managers. Once you are ready to enter the job market, start visiting 10-15 of these profiles per day. Even if 30% of them visit your profile back, you will be on the radar of a prospective employer, and that is when your well-built profile will make you stand out from the crowd.
  6. Use GitHub, Stack Overflow, HackerRank, HackerEarth, and other online coding platforms. GitHub and Stack Overflow profiles are as essential as your resume, so start committing your code to a repository and build your coding profile. Whenever you search Stack Overflow, do it while logged into your account, and start participating and answering questions; this will increase your visibility in no time. HackerRank and HackerEarth are two amazing platforms among many. Once you are reasonably comfortable in any programming language, start using these competitive programming platforms.
    Many companies, the likes of Amazon, Flipkart, Snapdeal, and Uber, recruit directly via online coding challenges, so it is up to you to secure your place on these platforms to get a better job.
  7. Be yourself. Don't fake anything if you want to sustain yourself in the long run. First figure out your career objective, then write that objective in your resume. It is perfectly okay not to get a job in the very first interview, or even after tens of interviews, but getting a job by faking yourself will cost you in the long run.
    If you lack a skill, build it. Trust me, you can master any skill that exists if you practice it for 10,000 hours. It is your journey to master your skill, and your destination to grab the best available job.

Step-by-Step Guide to Installing WordPress in a Local XAMPP Environment.

WordPress is one of the most powerful and most widely used CMS frameworks globally. Knowing WordPress is a serious skill, and you can earn good money with it.

Let's start with a short introduction and then set up WordPress on a local machine inside a XAMPP environment.

X- Cross-platform (Windows, Linux, or macOS)
A- Apache
M- MySQL
P- PHP
P- Perl

You can easily download XAMPP for your operating system by clicking the XAMPP download link. Once on the download page, choose the setup file for your operating system and follow the installation steps. If you get stuck on some problem, refer to the XAMPP installation FAQ for more information.

Once the installation is done, start the XAMPP application, either from the startup icon or from the application folder. Once it is running, start the Apache and MySQL servers.

Start XAMPP Server

Apache and MySQL servers started

Now test whether everything is running fine. Open any browser on your machine, type localhost in the URL bar, and hit Enter.

XAMPP running

Your XAMPP server is now up and running, with both the Apache and MySQL servers in a running state.

Now go to the application folder, then to XAMPP Files, and then to the "htdocs" folder. Folder names may differ depending on the operating system you are using.

XAMPP htdocs folder

This htdocs folder is the document root of your local server, which XAMPP is running for you. Here you can find the index.php file; this is the file that is served when we hit localhost in the browser. You are free to change anything here, and you can also change the index.html file.
Now cut all the files and folders inside htdocs and put them into a backup folder. After doing this, nothing is shown at localhost except the backup folder.

Now download the WordPress setup and unzip it. Once unzipped, your WordPress folder should contain the files shown below. If it does, copy the WordPress folder and paste it into htdocs, the root folder of the local server.

WordPress folder structure

Now type localhost/wordpress in your browser, and this time the WordPress setup page should come up. Before running the setup, we need to set up a MySQL database and a user who has privileges on that database.

Now type localhost/phpmyadmin, or click the phpMyAdmin link, to open phpMyAdmin.

Now click on the Databases tab, enter wordpress as the database name (you are free to use any database name here), and click Create.

phpMyAdmin

Select the newly created 'wordpress' database and click on the "Privileges" tab. Then select the "Add user account" option at the bottom of the page. We are creating a new user to avoid confusion and to make sure that no other application gets affected.

Now create a user by giving it a name. I am using "wp" as the user name; then select "localhost" as the host and set a password. I am using "wp1234" as my password.
Then scroll down and select the "Check all" option in the global privileges section. Once done, move to the bottom right of the page and hit the "Go" button.

Once done, we have the Apache and MySQL servers running, an empty database named 'wordpress', and a user named 'wp'. This user has the password 'wp1234' and holds all global privileges, so it has full access to the 'wordpress' database.
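
If you would like to double-check the new credentials outside phpMyAdmin, a small, purely optional sketch such as the one below can do it. It is not part of the WordPress setup itself; it assumes a Java environment with the MySQL Connector/J driver on the classpath, and the class name WpDbCheck is just a placeholder.

    import java.sql.Connection;
    import java.sql.DriverManager;

    // Optional sanity check for the credentials created above: database 'wordpress',
    // user 'wp', password 'wp1234'. Assumes MySQL Connector/J is on the classpath
    // and the XAMPP MySQL server is listening on the default port 3306.
    public class WpDbCheck {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:mysql://localhost:3306/wordpress";
            try (Connection conn = DriverManager.getConnection(url, "wp", "wp1234")) {
                System.out.println("Connected to 'wordpress' as 'wp': " + !conn.isClosed());
            }
        }
    }

If the connection succeeds, WordPress will be able to use the same credentials.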

Let's edit the wp-config-sample file and run WordPress on our machine.

wp-config-sample file

Open the wp-config-sample.php file in a text editor of your choice and edit it. Set DB_NAME to 'wordpress', DB_USER to 'wp', and DB_PASSWORD to 'wp1234'. DB_HOST should be left as localhost, because the database is on the same machine. Now save the file as wp-config.php (WordPress reads wp-config.php, not the sample file) and run your first WordPress blog by going to the localhost/wordpress URL.

Have fun, and also try this on a hosted server. Feel free to ask your queries in the comments.

What Hadoop is not?

Hadoop and MapReduce

Hadoop is a buzzword nowadays, and people think of it as magic. But Hadoop is not magic, and it is not a general-purpose application either. Hadoop is a framework suited to specific problems where:

  1. The data volume is very large.
  2. The data velocity is comparatively high.
  3. The organization needs a highly distributed framework for its business needs.

So, if you are thinking of Hadoop as a replacement for your regular database or regular filesystem, you will be highly disappointed. Here we focus on what Hadoop cannot do rather than what it actually can.

Apache Hadoop is not a replacement for a regular database: Databases are great; they run SELECT queries against indexes built over the stored data, and they are organized to answer end users' queries as efficiently as possible. If you replace your database with HDFS, Hadoop will store your data as files, and you cannot access that data directly with regular SQL commands. You need to write MapReduce jobs to access your data, and that is not an easy task; it takes effort and it also takes time to execute.
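
To give a feel for the effort involved, here is a minimal sketch of a MapReduce job written against the standard Hadoop Java API. It simply counts word occurrences in text files, something a database would answer with one short SQL query; the class names and input/output paths are just placeholders.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Minimal word-count job: even this simple "query" needs a mapper, a reducer, and a driver.
    public class WordCount {

        public static class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(LongWritable offset, Text line, Context ctx)
                    throws IOException, InterruptedException {
                // Emit (word, 1) for every token in the input line.
                for (String token : line.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text word, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                // Sum all the 1s emitted for this word across the whole data set.
                int sum = 0;
                for (IntWritable c : counts) {
                    sum += c.get();
                }
                ctx.write(word, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));   // input HDFS path (placeholder)
            FileOutputFormat.setOutputPath(job, new Path(args[1])); // output HDFS path (placeholder)
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

You would package this into a jar, submit it with the hadoop jar command with input and output HDFS paths as arguments, and then wait for the cluster to schedule and run the map and reduce tasks.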

Hadoop is a solution for very large data; there is no fixed size threshold, "large" simply means data over which regular databases can no longer run efficient queries. Hadoop is also a solution where the data format is not regular and cannot be fed into a conventional database directly. HBase is pretty useful if you want to use Hadoop but still need fast, record-level reads and writes; for SQL-like querying over Hadoop data you would look at tools such as Apache Hive.
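
As an illustration, a single-row lookup with the HBase Java client might look like the sketch below. The table name, row key, and column names are hypothetical, and the connection settings are assumed to come from an hbase-site.xml on the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseLookup {
        public static void main(String[] args) throws Exception {
            // Cluster connection details are read from hbase-site.xml on the classpath (assumption).
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 Table table = conn.getTable(TableName.valueOf("users"))) {            // hypothetical table
                Result row = table.get(new Get(Bytes.toBytes("user-1001")));            // hypothetical row key
                byte[] name = row.getValue(Bytes.toBytes("info"), Bytes.toBytes("name")); // hypothetical column
                System.out.println(name == null ? "row or column not found" : Bytes.toString(name));
            }
        }
    }

Note that this is a key-based lookup rather than a SQL query; HBase trades query flexibility for fast random access on top of HDFS.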

Hadoop and MapReduce are not the place to learn core Java: If you are just introducing yourself to programming, and especially to Java, then Hadoop is not the right place to start. The Hadoop documentation says you only need the basics of core Java to use the Hadoop and MapReduce APIs efficiently, but you should already be comfortable with Java errors, file paths, and Java debugging before starting with Hadoop.

Hadoop is not the ideal place to learn networking error messages or Linux system administration: The Hadoop path is a lot easier if you are already familiar with "Connection refused" and "No route to host" error messages; Hadoop itself will not teach you what they mean. Ideally, you should already know TCP/IP errors, LAN handling, and other common network protocols. Hadoop expects its clusters to be well connected, and network knowledge is needed to ensure that. On the same grounds, you should know your way around Linux/Unix systems and have basic prior knowledge of how to install them. The Hadoop framework expects users to know how to handle DNS errors, how to keep logs on disks other than the root disk, and what files live in the /etc directory.

Apart from this, you should also brush up your skills on:

  • SSH: what it is, how to set up authorized_keys, and how to use ssh and scp
  • ifconfig, nslookup, and other network configuration/diagnostic tools
  • How your platform keeps itself up to date
  • Which log files your machine generates, and what they mean
  • How to set up native filesystems and mount them

Distributed file systems: Introduction to HDFS.

HDFS

HDFS, the Hadoop Distributed File System, is a distributed file system designed to hold and manage very large files (terabytes or petabytes). Files are stored redundantly across multiple machines spread over the network, to ensure high availability and durability against failure.

What is a distributed file system?
A distributed file system is designed to hold a large amount of data across a network and to provide access to many clients distributed across that network.
Network File System (NFS) is one of the oldest distributed file systems and is still in heavy use. NFS is the most straightforward such system, but it has limitations as well. NFS is designed to give a client remote access to a single logical volume on a single remote machine. The client can see this volume and can even mount it on her own machine.
Once it is mounted locally, the client can use that volume as if it were part of her own machine and can run Linux and other file-related commands directly on it.
But when we talk about storage space, we are again limited to that single machine. A single machine has its own limits in terms of compute power and storage, so NFS may not be useful for Big Data implementations.
On the other side, HDFS is designed specifically for the Big Data environment and has a clear edge over NFS and other distributed file systems. While using HDFS, a client never needs a local copy of the data before processing it. More specifically:

1. HDFS is designed to store a very large amount of information (terabytes or petabytes). This requires spreading the data across a large number of machines. It also supports much larger file sizes than NFS.
2. HDFS files are spread across various machines on the network, so we can use ordinary commodity hardware in place of high-end machines. This arrangement cuts cost to a great extent.
3. HDFS stores data reliably across machines: with the default replication factor of 3, even if one machine fails completely, the data is still available from at least two other locations.
4. HDFS provides fast access to information, and if a large number of clients need access, we can scale out simply by adding more commodity nodes to the cluster.
5. Hadoop has a dedicated tool for distributing work to the data on the various machines and gathering the results after processing: MapReduce, which is tightly integrated with HDFS.
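
Tying these points together, here is a minimal sketch that uses Hadoop's Java FileSystem API to write a file into HDFS and read it back. The NameNode URI and the file path are assumptions for a local test setup; on a real cluster the file's blocks would be replicated across DataNodes automatically.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URI;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsHello {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Assumed NameNode address for a local test; use your cluster's fs.defaultFS value.
            FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), conf);

            Path file = new Path("/user/demo/hello.txt"); // hypothetical path
            // Write a small file; HDFS replicates its blocks across DataNodes automatically.
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
            }

            // Read it back through the same API; no local copy of the file is needed.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
                System.out.println(in.readLine());
            }
            fs.close();
        }
    }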

Even though Hadoop is very good for large-scale environments, it is not a general-purpose system the way NFS is. Hadoop is optimized for highly scaled-out jobs where thousands of cores need to work together; it is best suited to clusters of a thousand or more nodes. Hence, if we use Hadoop on a single machine or on a small cluster, it cannot deliver competitive results.
HDFS is modeled on the Google File System (GFS), which Google described in a white paper explaining the internal workings of its file storage.
We will discuss the block-based file structure of HDFS in the next article. You can get a better grasp by reading Google's white paper on GFS (linked above).

Problem Statement: Large-scale distributed environments and Hadoop.

Hadoop Challenges

Hadoop has been a buzzword for the past few years, and nowadays people are actually working with the Hadoop ecosystem at an enterprise level.

Even with Hadoop in use at the enterprise level, newcomers, and especially students, face a lot of difficulty in developing an understanding of Hadoop.

“Hadoop is a large-scale distributed batch processing infrastructure. While it can be used on a single machine, its true power lies in its ability to scale to hundreds or thousands of computers, each with several processor cores. Hadoop is also designed to efficiently distribute large amounts of work across a set of machines.” — Yahoo Developer Network

Large scale never means a few gigabytes; it means terabytes or even petabytes of files. Big Data is commonly described by three V's: Volume (of data, often petabytes), Velocity (of data), and Variety (of data). Some people also add a fourth V, Veracity (the uncertainty in data).

Now let's point out the main challenges of large-scale distributed systems. We are only listing these challenges here; we may cover them in more detail in other articles.

1. The network can suffer partial or total failure at any time; a router or switch might break down.

2. Data may not arrive at a specific point when needed, due to unexpected network congestion.

3. Individual compute nodes may fail due to overheating or other issues, or may run out of memory or disk space.

4. Different nodes may run different software versions, or clients may even be using different data processing software. This can be a big issue in large-scale distributed systems.

5. There are security concerns among the various nodes, as a default Hadoop setup does not enforce a strong security protocol between them, which makes it difficult to detect and rectify a man-in-the-middle attack.

6. Clock synchronization across nodes is again a big concern, as it is essential to keep the machines' clocks properly in sync.

So, while setting up a large-scale Big Data environment, one must take care of the critical challenges above. The Hadoop ecosystem addresses many of them, and individual design choices also play a role in addressing the rest.

Please share your doubts and suggestions; I will share more of my learnings and experience with you soon.