Hello I'm

Vishal Barad

Data scientist || Data engineer || Pyspark || Python || SQL || 1x Airflow certified || Databricks || Azure


About Me

Need a data scientist, data engineer, or Python developer to build your solution?

I'll help you create a well-structured, scalable solution using my Python, machine learning, deep learning, and PySpark skills, delivered as a scalable web app built with Django and Django REST framework, that’s perfect for you.

I'm a data scientist and data engineer with experience in a range of technologies, including machine learning, deep learning, NLP, cloud technologies, and software engineering. I have around 2 years of experience and a demonstrated history of sound architecture and technical knowledge in data science and engineering.

My toolkit includes Python, PySpark, machine learning, NLP, deep learning, Django/Django REST framework, Flask, PostgreSQL, Airflow, Azure, and Databricks, as well as tools like Heroku for deployment. I'm also always happy to learn new tools!


My Works

IPL match score prediction (Freelance)

The purpose of this project is to predict the first-innings score of an IPL match based on venue, batting team, bowling team, runs, overs, wickets, etc.

Responsibilities

• Gathered the dataset: this project uses ‘ipl.csv’, downloaded from Kaggle. After importing it into a Jupyter Notebook, I first cleaned the data by dropping unnecessary columns, handling missing values, and one-hot encoding the categorical variables.

• Divided the dataset into independent and dependent features (independent features = venue, runs, wickets, etc. | dependent feature = total).

• Dropped the first 5 overs of data in every match, because those overs fall within the powerplay, so I ignored them.

• Split the data into training and testing sets, then performed model selection by fitting multiple linear, Ridge, Lasso, decision tree, and random forest regression algorithms and computing each one's accuracy score.
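The steps above can be sketched roughly as follows. This is only an illustration, not the project's actual notebook: it uses a small synthetic table in place of ‘ipl.csv’, and every column name, team, venue, and hyperparameter here is an assumption. The score reported by scikit-learn's `score()` for regressors is R², which is what "accuracy score" is taken to mean in this sketch.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for 'ipl.csv'; all columns and values are made up.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "venue": rng.choice(["Venue A", "Venue B", "Venue C"], n),
    "bat_team": rng.choice(["Team 1", "Team 2", "Team 3"], n),
    "bowl_team": rng.choice(["Team 4", "Team 5", "Team 6"], n),
    "overs": rng.uniform(0.0, 20.0, n).round(1),
    "wickets": rng.integers(0, 10, n),
})
df["runs"] = (df["overs"] * 8 + rng.normal(0, 5, n)).clip(lower=0).round()
df["total"] = (df["runs"] + (20 - df["overs"]) * 8
               - df["wickets"] * 3 + rng.normal(0, 4, n)).round()

# Ignore the first 5 overs of every innings.
df = df[df["overs"] >= 5.0]

# One-hot encode the categorical columns; 'total' is the dependent feature.
X = pd.get_dummies(df.drop(columns="total"),
                   columns=["venue", "bat_team", "bowl_team"], dtype=float)
y = df["total"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Candidate regressors for model selection (hyperparameters are assumptions).
models = {
    "Multiple linear": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Decision tree": DecisionTreeRegressor(random_state=0),
    "Random forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # R^2 on the held-out test set

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Comparing all candidates on the same held-out split keeps the comparison fair; the best-scoring model would then be the one saved and served.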

Technologies

Python, Machine learning regression algorithm, Numpy, Pandas, Joblib, Flask, Heroku, HTML, CSS

Movie recommendation (Self Project)

The purpose of this project is to recommend the top 10 movies similar to ones the user has watched before. The project is based on the model-based collaborative filtering ML algorithm.

Responsibilities

• Gathered the dataset: this project uses ‘movies.csv’ and ‘ratings.csv’ from the MovieLens dataset (https://grouplens.org/datasets/movielens/latest/). After importing them into a Jupyter Notebook, I first cleaned the data by dropping unnecessary columns and handling missing values, then merged the two datasets, ‘movie’ and ‘rating’, on ‘movieId’.

• Created a pivot table and then a sparse matrix from it.

• Fitted the ‘NearestNeighbors’ model on the sparse matrix.

• Used the ‘kneighbors()’ method of the ‘NearestNeighbors’ model, which returns distances and indices, to find the most similar users, and from these predicted the top 10 similar movies.

• Saved the model using the ‘joblib’ library, then created the UI in Flask.

• Deployed on Heroku.
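The pivot table, sparse matrix, and ‘kneighbors()’ steps above might look something like the sketch below. It is a toy illustration under stated assumptions: the two small tables are invented stand-ins for ‘movies.csv’ and ‘ratings.csv’, the pivot is assumed to have one row per movie title and one column per user, cosine distance with a brute-force search is an assumed configuration, and `recommend` is a hypothetical helper name.

```python
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.neighbors import NearestNeighbors

# Tiny made-up stand-ins for 'movies.csv' and 'ratings.csv'.
movies = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Movie A", "Movie B", "Movie C", "Movie D"],
})
ratings = pd.DataFrame({
    "userId":  [1, 1, 2, 2, 3, 3, 4, 4],
    "movieId": [1, 2, 1, 2, 3, 4, 3, 4],
    "rating":  [5.0, 4.5, 4.0, 4.0, 5.0, 4.5, 3.0, 3.5],
})

# Merge on 'movieId', then pivot to a title x user rating table.
merged = ratings.merge(movies, on="movieId")
pivot = merged.pivot_table(index="title", columns="userId",
                           values="rating").fillna(0)
matrix = csr_matrix(pivot.values)

# Fit a nearest-neighbours model on the sparse matrix.
model = NearestNeighbors(metric="cosine", algorithm="brute")
model.fit(matrix)


def recommend(title, k=10):
    """Return up to k titles whose rating vectors are closest to `title`."""
    k = min(k, len(pivot) - 1)          # cannot ask for more movies than exist
    row = pivot.index.get_loc(title)
    # kneighbors() returns (distances, indices); the query movie itself is
    # among the neighbours, so request one extra and filter it out.
    distances, indices = model.kneighbors(matrix[row], n_neighbors=k + 1)
    return [pivot.index[i] for i in indices.flatten() if i != row][:k]


print(recommend("Movie A", k=1))
```

On the full MovieLens data the default `k=10` would give the top 10 similar movies; the fitted model can then be persisted with `joblib.dump` and loaded by the Flask app.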

Technologies

Python, Machine learning regression algorithm, Numpy, Pandas, Joblib, Flask, Heroku, HTML, CSS, Pycharm

Smiley face detection (Self Project)

This project is built using a deep learning CNN and OpenCV. It detects whether a face looks happy or neutral.

Responsibilities

• Gathered the dataset, which was downloaded from Kaggle.

• Applied data augmentation to generate more data and avoid overfitting.

• Applied several CNN layers, such as Conv2D (with its filters), max pooling, and dropout, followed by a flatten layer at the end.

• Compiled the model with an optimizer and the accuracy metric.

• Saved the model using the ‘joblib’ library, then used OpenCV to detect the face and predict the output.

Technologies

Python, Deep learning, Numpy, CNN, Opencv, Pandas, Joblib, Flask, Heroku, HTML, CSS, Pycharm

Face mask detection (Self Project)

This project is built using a deep learning CNN and OpenCV. Its purpose is to detect whether a person is wearing a mask or not.

Responsibilities

• The project is based on a CNN model. The dataset is available at 'https://drive.google.com/drive/folders/1P3gIgFUMbdl5tSqx1pK385bz4t0mxEvW?usp=sharing'. It contains train, test, and validation splits, each with two classes, 'with_mask' and 'without_mask'. After importing the dataset into Google Colab, I first applied image augmentation: rescaling, rotation, flipping, zooming, etc.

• Compiled the model using the 'Adam' optimizer with learning rate = 0.01 and loss = 'sparse_categorical_crossentropy'.

• Trained the model for 50 epochs, reaching 98% validation accuracy and 96% training accuracy.

• Saved the model as 'final_Face_mask_scratch.h5'. After downloading it, I created a 'prediction.ipynb' notebook that uses OpenCV to predict the output.
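To give a feel for the augmentation step in the first bullet (rescale, rotate, flip), here is a minimal NumPy-only sketch. It is a stand-in, not the project's pipeline: the random image is fake data, rotation is limited to multiples of 90 degrees for simplicity, zoom is omitted, and `augment` is a hypothetical helper name.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly rescaled, rotated, and flipped copy of an HxWxC uint8 image."""
    out = image.astype(np.float32) / 255.0           # rescale pixels to [0, 1]
    out = np.rot90(out, k=int(rng.integers(0, 4)))   # rotate by 0, 90, 180, or 270 degrees
    if rng.random() < 0.5:
        out = np.fliplr(out)                         # horizontal flip half of the time
    return out


rng = np.random.default_rng(0)
# Fake square RGB image standing in for a 'with_mask' / 'without_mask' photo.
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Generate several augmented variants of the same source image.
batch = np.stack([augment(image, rng) for _ in range(8)])
print(batch.shape, batch.dtype)
```

Each pass produces a slightly different view of the same face, which is how augmentation enlarges the effective training set and helps reduce overfitting.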

Technologies

Python, Deep learning, Numpy, CNN, Opencv, Pandas, Joblib, Flask, Heroku, HTML, CSS, Pycharm

Rasa product chatbot (Freelance)

This project is built using NLP and a Rasa chatbot. Its purpose is to suggest the best product based on the user's questions and answers.

Responsibilities

• Developed a simple Rasa chatbot

• Designed the workflow

• Extracted keywords from user answers using NLP

• Recommended the best product to the user

Technologies

Python, Pycharm, mysql, machine learning, NLP, RASA chatbot

Xplotur (Infozium solution PVT LTD)

Xplotur is an online travel platform that aims to exceed travelers' expectations with absolute determination and commitment. It offers travel planning, itinerary design, transportation facilities, hotel booking, and flight booking services.

Responsibilities

• Developed Hotel module

• Communicated with hotel API provider

• Integrated hotel API with frontend

Technologies

PHP, VS Code, MySQL, WordPress

Voiceme project (Kintu designs PVT LTD)

VoiceMe is an online banking application in which users can carry out bank transactions by voice, simply by talking to a Rasa chatbot (similar to Siri).

Responsibilities

• Developed a simple Rasa chatbot

• Designed the workflow

• Performed text-to-speech using the gTTS (Google Text-to-Speech) library

• Extracted keywords from the user's voice input using NLP

• Performed speech-to-text using the SpeechRecognition library

Technologies

Python, Pycharm, mysql, machine learning, NLP, RASA chatbot

Simplyloose (Kintu designs PVT LTD)

Simplyloose is an IoT-based online personal training platform that helps fitness experts (doctors, dietitians, gym and yoga trainers, and physiotherapists) connect better with their customers when training them online or in person.

Responsibilities

• Scraped data using BS4 and Scrapy

• Designed the workflow

• Performed data pre-processing on the scraped data

• Used the scraped data to build a machine learning model

• Because it was a classification problem, I used a random forest classifier to predict diets and recipes.

• Applied natural language processing to extract keywords from user input during chat

Technologies

Python, Pycharm, mysql, machine learning, web scraping, Django

Tamil Nadu e-Governance Agency – SFDB, Govt. of Tamil Nadu (Vedity software pvt ltd and KPMG)

One of the core functions of this project is to deliver public services to citizens. As residents' expectations rise, the government is expected to deliver public services more conveniently. To meet these expectations, the government intends to provide timely and convenient delivery of public services and social benefits, and to improve the efficiency of public service delivery.

Responsibilities

• Designed the data preprocessing pipeline

• Wrote a PySpark script for data ETL. We used PySpark because of the data size, which is around 7 crore (70 million).

• Defined the pipeline using Airflow

Technologies

Python, PySpark, Airflow, pandas, machine learning algorithm, PostgreSQL


Blogs

Data Preprocessing Using Pyspark (Part:1)

Apache Spark is a framework that allows for quick data processing on large amounts of data...


Unsupervised learning algorithm

In an unsupervised learning algorithm we don't supervise the model; instead, we let the model work on its own to discover information that may not be visible to the human eye...


Skills & Tools

Python language

Machine learning

Deep learning

PySpark tool

Microsoft Azure

Apache Airflow

SQL (PostgreSQL, MySQL)

Azure Databricks

Django | Django REST framework

Web scraping

Natural language processing

Pandas tool


Work Experience

Data scientist | Data engineer

Vedity software pvt ltd

Ahmedabad

Sept 2021 - Current

Full-Time

Data scientist

Kintu design pvt ltd

Surat

May 2021 - Sept 2021

Full-Time

Full stack developer

Infozium solution pvt ltd

Surat

Feb 2021 - May 2021

Full-Time

Data science and python developer Freelancer

freelancer.com and upwork.com

Surat

Nov 2020 - Sept 2021

Full-Time

Testimonials

Nshini

freelance - July 2021

Build me a chatbot using Rasa or any open source programs

The work was professionally done within the requirements and budget. It was easy to communicate with him and he answered all my queries promptly. I will definitely recommend him.


Skills: Python Programming, Software Architecture, Chatbot

Soham A.

freelance - Nov 2020

Boosting regression algorithm solution on paper

Good Job done Vishal. Project completed successfully and happy to work with him in future.


Skills: Python, Algorithm, Report Writing, Machine Learning (ML), Statistical Analysis

Ashish K.

freelance - Dec 2020

Prediction of English Premier league using Artificial Intelligence

Good job Vishal , keep it up !


Skills: Machine Learning (ML), React.js, Artificial Intelligence, Blockchain, Ethereum

freelance - Nov 2020

Machine learning classification -- 3

have good knowledge of machine learning and has experience in coding too. Delivered the project on time.


Skills: Python, Matlab and Mathematica, Machine Learning (ML), Deep Learning

 Fahmi E.

freelance - Oct 2020

Expert Django Rest Framework Required for Clean Code Design Review and Refactor - Hourly Consultation

An excellent django developer. Vishal helped me restructure my django code infrastructure. Fast and responsive. Straight to the point.


Skills: Python, Django, RESTful API

Get in touch

If you like my profile and think I'm the right person for your work, get in touch with me, or mail me directly at [email protected].
