A step-by-step guide, from installation to best practices and tips, on creating annotations in Prodigy for your next NLP project

If you have never used spaCy, it would be good to go through this article, where the author explains clearly (blatant self-promotion 😏) how to get started in spaCy. (An in-depth understanding is provided in 7 minutes flat!)

What is Prodigy?

From the makers of spaCy (the NLP package) comes…

An intuitive understanding of the torchvision library: basics to advanced (Part 1/3)

What is torchvision?

Torchvision is a computer vision library that goes hand in hand with PyTorch. It has utilities for efficient image and video transformations, some commonly used pre-trained models, and some datasets. (torchvision does not come bundled with PyTorch; you will have to install it separately.)

About this article

This series of…


Use these 3 simple techniques to speed up your web scraping using Beautiful Soup

Why is this useful and important?

Most of the time, when you scrape a site to pull public data for your data science projects, you end up doing it in a loop (sometimes a few thousand times), and every second that you save in your loop adds up significantly in the…

NLP — Natural Language Processing

And how spaCy v3.0 can help you get started with these use-cases

In this article, we will list and explain the different use cases and give a brief overview of how we can use spaCy to tackle them.

Oh, in case you don’t know what spaCy is: it is one of the more popular industrial-strength NLP solutions…

6 ways in which you can format your string/numerical output in Python

About this article

Most of the time, a simple print() statement might lead to messy code. And sometimes you do not need a logger to handle the job. Python offers different options for formatting your string output. Here are a few.

Formatting strings

1. Using %s in statements

print("Some text %s some text %s some text " % (str1, str2))
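As a minimal, runnable sketch of the %s style above: the values given to str1 and str2 here are made up for illustration, since the snippet does not define them.

```python
# Hypothetical sample values; the original snippet does not define str1 or str2.
str1 = "alpha"
str2 = "beta"

# Each %s is replaced, in order, by the matching item of the tuple on the right.
message = "Some text %s some text %s some text" % (str1, str2)
print(message)

# %s calls str() on its argument, so non-string values such as numbers also work.
count_line = "Processed %s files in %s seconds" % (3, 1.5)
print(count_line)
```

Note that the number of %s placeholders has to match the number of items in the tuple; otherwise Python raises a TypeError.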

How to remove the HTML tags from your corpus for building your NLP data-set

This article is part of the supporting material for the story ‘Understanding NLP — from TF-IDF to transformers’


Most of the time, when you want to process a ton of HTML files in your corpus, you have to think about cleaning the HTML as a pre-processing step.
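One possible way to do this cleaning, sketched here with only Python's standard-library html.parser module. This is an assumption for illustration, not necessarily the approach the article itself takes; the TagStripper class, the strip_html helper, and the sample markup are all hypothetical names and data.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Collects text nodes and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script or style element.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join chunks with a space, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p {color: red;}</style></head>"
    "<body><h1>Title</h1><p>Hello &amp; <b>welcome</b> to NLP</p>"
    "<script>var x = 1;</script></body></html>"
)
cleaned = strip_html(sample)
print(cleaned)
```

Joining the text chunks with a space keeps words from adjacent elements apart, at the cost of sometimes adding a space before punctuation that directly follows a closing tag.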

A list of things-to-do for data pre-processing for creating Machine Learning data-sets (and a few handy TIPS)

In this article

In this article, we will look at the steps involved in data pre-processing, along with some relevant Python code to perform them.

We will also see the need to build an exhaustive checklist of pre-processing steps that you can apply to your data-set. A starter checklist is…

6 reasons why learning through Kaggle competitions does not prepare you for real-world data science problems

First things first: we have been on Kaggle for over 4 years and would like to acknowledge the role Kaggle has played in our data science journey as a very good learning platform for beginners.

A little background: recently, we got the badge for becoming a competitions

Lars Nielsen

Making AI available to everyone. One commit at a time. A.I. | M.L. | Quantitative programming | IoT | Python | Arduino | Comp Vision
