
Importance of exploratory data analysis with dummy variables and logit/probit using EViews

By IMPRI Team 
 
IMPRI Generation Alpha Data Centre (GenAlphaDC), along with IMPRI Impact and Policy Research Institute, New Delhi, conducted a Two-Week Immersive Online Hands-On Certificate Training Course on Exploratory Data Analysis with Categorical Variables Regression Models: Dummy Variables and Logit/Probit using EViews, on December 10 and 17, 2022. The expert trainer for the course was Nilanjan Banik, Professor at Mahindra University. He is also a Visiting Consultant at IMPRI, an Academic Consultant with Geneva Network, United Kingdom, and a Senior Consultant with Hankuk University of Foreign Studies, South Korea.
The convenors for the event were Prof Vibhuti Patel, Visiting Professor at IMPRI and former Professor, Tata Institute of Social Sciences (TISS), Mumbai; Dr Soumyadip Chattopadhyay, Associate Professor, Economics, Visva-Bharati, Santiniketan, and Visiting Senior Fellow, IMPRI; and Dr Arjun Kumar, Director, IMPRI. The training course drew participants from the fields of data and policy, including students, professionals, and researchers, among others.

Day 1 | December 10, 2022

The session began with the basics of regression. Professor Banik first asked what a “dummy” means; in essence, he said, it is a replica. In a regression model, a dummy variable X stands in for a qualitative variable rather than a quantitative one. He laid down some assumptions about the dependent and independent variables, chief among them that X and the error term (e) must be unrelated; if they are related, the model suffers from endogeneity. He then described what dummy variables can do. First, they can represent qualitative traits such as gender and ethnicity in a regression model, so that the model captures the impact of any variable that is qualitative in nature.
Second, he noted that a dummy variable can capture a break or shift in the data. He used the example of the Indian economic reforms of 1991, which marked a breakpoint in per capita GDP: after 1991 there was a sharp jump in GDP growth, in other words a structural break. Third, he noted that dummy variables can be used to de-seasonalize data. Using Excel, he showed how to include dummy variables in a regression model and why one dummy must be dropped to avoid the dummy variable trap (with an intercept, a full set of category dummies is perfectly collinear). He also showed how to de-seasonalize the data in Excel; after de-seasonalizing, the plotted series was more stable than before.
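The dummy variable trap and the de-seasonalizing step can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the Excel or EViews workflow used in the course: the quarterly sales series is synthetic, and names such as `seasonal_effect` and `deseasonalized` are hypothetical.

```python
import numpy as np

# Synthetic quarterly sales (hypothetical numbers, not the course data):
# a linear trend plus a fixed seasonal effect for each quarter.
n_years = 6
quarter = np.tile(np.arange(4), n_years)            # 0, 1, 2, 3, 0, 1, ...
trend = np.arange(quarter.size, dtype=float)
seasonal_effect = np.array([0.0, 5.0, -3.0, 8.0])   # Q1 is the base quarter
sales = 100.0 + 2.0 * trend + seasonal_effect[quarter]

# One 0/1 dummy column per quarter.
dummies = (quarter[:, None] == np.arange(4)).astype(float)
intercept = np.ones((quarter.size, 1))

# Dummy variable trap: the four dummies add up to the intercept column,
# so this 6-column design matrix has rank 5 and OLS is not identified.
X_trap = np.hstack([intercept, trend[:, None], dummies])
rank_trap = np.linalg.matrix_rank(X_trap)

# Drop the Q1 dummy; Q1 becomes the reference category.
X = np.hstack([intercept, trend[:, None], dummies[:, 1:]])
beta, *_ = np.linalg.lstsq(X, sales, rcond=None)

# De-seasonalize: subtract the estimated quarter effects (relative to Q1).
deseasonalized = sales - dummies[:, 1:] @ beta[2:]
```

Each dummy coefficient in `beta[2:]` is the estimated difference between that quarter and the omitted first quarter, which is why subtracting those effects leaves only the level and the trend.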
After the Excel walkthrough, he worked through the same data set in EViews. He selected variables such as sales figures and trend, created dummy variables for the four quarters, and ran the regression first without the dummies. He then turned to a data set in which he introduced a dummy and ran the regression again: US gross personal income and gross personal savings from 1959 to 2007. The dummy variable marked the recession period from 1981 to 1984, and the resulting regression plot showed the break caused by the 1981 recession. The day's session ended with Professor Banik taking questions and clearing the trainees' doubts. The next class was reserved for Logit and Probit models.
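A structural-break dummy of the kind described here can be sketched as follows. The income and savings numbers are made up for illustration (the course used actual US data in EViews); only the 1959 to 2007 span and the 1981 to 1984 recession window come from the session.

```python
import numpy as np

years = np.arange(1959, 2008)

# Hypothetical income series and savings relationship (not the real US data).
income = 400.0 + 150.0 * (years - 1959)

# Dummy equals 1 in the recession years 1981 to 1984 and 0 otherwise.
recession = ((years >= 1981) & (years <= 1984)).astype(float)

# Assumed data-generating process: savings drop by 15 during the recession.
savings = 20.0 + 0.08 * income - 15.0 * recession

# OLS of savings on an intercept, income, and the recession dummy.
X = np.column_stack([np.ones_like(income), income, recession])
coef, *_ = np.linalg.lstsq(X, savings, rcond=None)
intercept, slope, shift = coef
```

The coefficient `shift` estimates how far the regression line moves up or down during the dummy period. Interacting the dummy with `income` would additionally allow the slope to change across the break.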

Day 2 | December 17, 2022

The second day of the course, conducted by Professor Nilanjan Banik and titled “Exploratory Data Analysis with Categorical Variables Regression Models: Dummy Variables and Logit/Probit using EViews”, was devoted to Logit and Probit. Professor Banik started by explaining the basic regression equation and its components. The aim was to distinguish two cases: dummy variables are used when the explanatory variable X is qualitative, and Logit/Probit models when the dependent variable Y is qualitative. Then, with an example of hourly wage rates, he showed how to interpret the coefficients on dummy variables for various categorical variables.
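As a small illustration of how a dummy coefficient is read in a wage regression, the sketch below regresses hourly wages on a single dummy. The wages and the `female` dummy are hypothetical and are not taken from the course example.

```python
import numpy as np

# Hypothetical hourly wages and an illustrative 0/1 dummy.
wage = np.array([20.0, 22.0, 24.0, 18.0, 17.0, 19.0, 21.0, 16.0])
female = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])

# OLS of wage on an intercept and the dummy.
X = np.column_stack([np.ones_like(wage), female])
(b0, b1), *_ = np.linalg.lstsq(X, wage, rcond=None)

# With a single dummy regressor, the intercept is the mean wage of the
# reference group (dummy = 0) and the dummy coefficient is the difference
# between the two group means.
mean_reference = wage[female == 0].mean()
mean_gap = wage[female == 1].mean() - mean_reference
```

With further regressors such as education or experience in the model, the dummy coefficient would instead measure the wage gap holding those variables fixed.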
After explaining dummy variables, he moved on to Logit functions. In a Logit model the observed dependent variable, Y, takes the value 1 or 0. The Logit or Probit model describes the probability that an individual exhibits the outcome, given a certain trait or characteristic. He mentioned the importance of likelihood ratio (LR) tests in Probit models. The Logit/Probit models are primarily concerned with the dependent variable (Y). He showed that, although the observed Y is binary, the transformed dependent variable, the log of the odds ln(P/(1 − P)), can take any value from negative infinity to positive infinity. He demonstrated this by deriving it from the probability P: since P lies between 0 and 1, the odds P/(1 − P) range from 0 to infinity, and their logarithm ranges from negative infinity to positive infinity.
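The range argument can be written out as a short derivation. The single regressor X and the coefficient symbols β0 and β1 are notation assumed here for illustration; the session's own notation is not recorded.

```latex
\begin{align*}
P &= \Pr(Y = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}},
  & 0 &< P < 1 \\
\frac{P}{1 - P} &= e^{\beta_0 + \beta_1 X},
  & 0 &< \frac{P}{1 - P} < \infty \\
\ln\!\left(\frac{P}{1 - P}\right) &= \beta_0 + \beta_1 X,
  & -\infty &< \ln\!\left(\frac{P}{1 - P}\right) < \infty
\end{align*}
```

The Probit model follows the same logic but replaces the logistic function with the standard normal distribution function, so that P = Φ(β0 + β1 X).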
After the theory, he began a practical session in EViews. First, he showed how to introduce dummy variables for a set of observations. Professor Banik then showed how to estimate and interpret a Logit model in EViews. For this, he again used the earlier US data on savings and income to show the recession point. Using another data set, he showed how smoking is affected by age, income, and education. He explained what the P value shows using its formula, and demonstrated it in practice with the regression model and its results. He then took questions from the trainees and clarified their doubts. With this, the two-day training course ended.
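For readers without EViews, a Logit model of this kind can be sketched in Python with NumPy. Everything below is an assumption made for illustration: the data are simulated, a single standardized `education` regressor stands in for the age, income, and education variables used in the session, and the hypothetical helper `fit_logit` uses plain Newton-Raphson to maximize the likelihood, which is not necessarily the routine EViews uses.

```python
import numpy as np

def fit_logit(X, y, n_iter=25):
    """Maximum-likelihood logit coefficients via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))             # fitted probabilities
        gradient = X.T @ (y - p)                        # score vector
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Simulated data (hypothetical): more education lowers the chance of smoking.
rng = np.random.default_rng(0)
n = 500
education = rng.normal(size=n)           # standardized years of schooling
X = np.column_stack([np.ones(n), education])
true_beta = np.array([-0.5, -0.8])       # assumed for the simulation only
p_true = 1.0 / (1.0 + np.exp(-X @ true_beta))
smoker = (rng.random(n) < p_true).astype(float)

beta_hat = fit_logit(X, smoker)

# Predicted probability of smoking for each observation, always in (0, 1).
p_hat = 1.0 / (1.0 + np.exp(-X @ beta_hat))

# exp(coefficient) is the multiplicative change in the odds of smoking
# for a one-unit increase in the regressor.
odds_ratio = np.exp(beta_hat[1])
```

A negative coefficient on `education` means the log-odds, and therefore the probability, of smoking fall as education rises. The size of the effect on the probability itself depends on where along the curve it is evaluated.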
---
Acknowledgement: Aaswash Mahanta is a research intern at IMPRI
