Major Research Projects at UT-Austin (Aug’10 - May’12)

1. Localized Simultaneous Clustering and Classification

Advisor - Prof. J. Ghosh, UT-Austin

          I developed a generative model that discovers small, localized clusters in large heterogeneous datasets in order to improve classification accuracy. For example, the consumers of a product or service rarely form one monolithic group; they are better described as a collection of small clusters of consumers with similar purchasing behavior. The graphical model captures this property by inferring the cluster probabilities from both the independent and the dependent variables. Logistic regression with a LASSO penalty was used as the objective function, but the proposed model works with any standard family of objective functions (other GLMs) for inferring class labels. Besides better accuracy, the model gives insight into the features that most affect the purchasing behavior of each consumer cluster. The model was trained and tested on Applebee’s restaurant survey data, where the task was to predict whether a person will visit Applebee’s given their demographic and other features. It outperformed most of the standard classification methods and also showed which groups prefer which kinds of food at Applebee’s and what improvements Applebee’s needs to make to increase its customer base. The model also performed well on various UCI datasets and on CDC health care datasets on infant mortality and diabetes. This work was done under the supervision of Prof. Joydeep Ghosh (Dept. of ECE, UT-Austin).

2. Supervised Language Modeling for Temporal Resolution of Texts

(Masters' Thesis)

Advisors - Prof. J Ghosh, Prof. M. Lease, Prof. J. Baldridge, UT-Austin

          The learning task is to predict the publication date of a text based solely on the words it contains. We designed a language model that, among other things, uses asymmetric divergence families such as KL and expected squared Hellinger distance to predict the publication date. Wikipedia biographies were used as the training set, and Gutenberg short stories were used as the validation and test sets. The work has been accepted at CIKM 2011, Glasgow, UK, pp. 0846-0849.
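The idea of dating a text by comparing word distributions can be illustrated with a minimal sketch. It is not the thesis model: it assumes plain add-alpha smoothed unigram models, one per time period, and picks the period whose model has the smallest KL divergence from the document's model. The function names and the toy training text are hypothetical.

```python
import math
from collections import Counter

def unigram_model(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(w for w in text.lower().split() if w in vocab)
    total = sum(counts.values()) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q); smoothing guarantees both are positive on the same support."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

def predict_period(document, period_texts):
    """Return the period whose language model is closest to the document in KL."""
    vocab = set(document.lower().split())
    for text in period_texts.values():
        vocab.update(text.lower().split())
    doc_model = unigram_model(document, vocab)
    period_models = {p: unigram_model(t, vocab) for p, t in period_texts.items()}
    return min(period_models,
               key=lambda p: kl_divergence(doc_model, period_models[p]))

# Toy, made-up training text per period (illustrative only).
period_texts = {
    "1800s": "the carriage horse telegraph steam carriage lantern",
    "2000s": "the internet phone email computer internet website",
}
predicted = predict_period("she sent an email from her computer", period_texts)
```

Because KL divergence is asymmetric, swapping the argument order (period model first, document model second) is a different scoring rule and can rank periods differently.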

3. Topic Discovery for Attribute Extraction

Advisors - Prof. J Ghosh, Prof. M. Lease, Dr. R. Chatwin, Adchemy Inc.

          I designed a WordNet-based model to extract product attributes for search engine ad creation. The input to the model is the set of catalog name entries for a product category (e.g. a furniture catalog), and the output is a set of word clusters that represent either different families of products (e.g. chair, table, mattress in the case of furniture) or different attributes of the product (e.g. color, material, style of the furniture). The model takes into account not only word co-occurrences but also the semantic similarity among words through WordNet. It outperforms topic models on this task. Moreover, for each topic discovered, the word with the highest class-conditional probability can be used as its topic label.

Professional Projects at Oracle (July'07-July'10)

1. Graphical Visualization of Oracle Job Scheduling

         Designed and developed a graphical visualization and editing tool for the Oracle Job Scheduling APIs. The tool graphically depicts all the existing first-class objects in the scheduler (e.g. Jobs, Chains, Job Schedules, Job Classes, Windows) and the relationships between them. For Chains, it shows the individual chain steps and lets the user add, delete, or edit a step graphically without writing a single query.
         Dragging and dropping one object onto another (e.g. a Job onto a Job Class) creates a well-defined relationship between the two objects, so the user can do all tasks graphically instead of writing queries. The code was written in Java using the JGraphX visualization APIs.

2. DB2 SQL to Oracle PL/SQL Migration

         Designed and implemented a SQL parser for DB2 SQL in order to translate it to Oracle PL/SQL. Wrote the lexer, parser, and walker, which tokenize, recognize, and translate the input, respectively. Contributed to the development of the metadata capture phase of the overall translation. The grammar is written in ANTLR 3.1, and the translation actions are written in Java. This translator and migration framework is now part of the Oracle SQL Developer 2.1 release.

3. Background Task Framework

         As a member of the Oracle SQL Developer team, designed and developed the Background Task Framework, which is part of the 2.1 release. Every non-GUI task that is spawned now runs in a background thread, leaving the user free to perform other GUI operations. This minimizes GUI freezes and systematizes the use of multithreading in SQL Developer. Written in Java, it uses the Futures threading framework in Java 1.6.
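The future-based pattern described above can be sketched briefly. The framework itself was written in Java; this illustration uses Python's concurrent.futures as a stand-in, and the task function and the polling loop that plays the role of the GUI thread are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def long_running_query(rows):
    """Stand-in for a non-GUI task such as running a query (hypothetical)."""
    time.sleep(0.05)
    return sum(range(rows))

executor = ThreadPoolExecutor(max_workers=2)
# submit() returns a Future immediately; the work runs on a background thread.
future = executor.submit(long_running_query, 1000)

# The calling ("GUI") thread stays free to handle other work while it waits.
ui_events_handled = 0
while not future.done():
    ui_events_handled += 1
    time.sleep(0.005)

result = future.result()  # the task has finished, so this returns at once
executor.shutdown()
```

A real GUI would normally register a completion callback rather than poll; the loop here only makes the non-blocking behavior visible.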

4. T-SQL to Oracle PL/SQL Migration

         Developed the metadata capture phase of the T-SQL migration. Contributed to the development of the T-SQL parser, which was released with Oracle SQL Developer 1.5.

5. DBMS Output & OWA Output

         Designed and developed the GUI framework for showing DBMS and OWA output in Oracle SQL Developer. This is now part of Oracle SQL Developer 2.1.

Major Academic Projects at IIT-G (2003 - 2007)

1. Event Processing Engine for Stream Databases

(B.Tech. Thesis)


Advisor - Dr G. Sajith, Prof., Department of CSE, IIT Guwahati

         Designed and implemented an engine for analyzing stream queries in real time. It is a multithreaded engine that parses incoming events using ANTLR and, for every query registered in the engine, stores the results after analyzing each event. It also maintains a time window of width t that stores the last n events. The engine was simulated on a real-world model of sensors and stock tickers. The code is primarily in Java, with the parser grammar written in ANTLR. The multiple stream sources were simulated using multiple Java threads.
         The engine shows a real-time graphical visualization of all the stream sources (e.g. sensors) in the network, implemented using the Prefuse visualization toolkit.
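The time window mentioned above can be sketched as a small data structure. This is one possible reading, not the engine's actual code (which was in Java): it assumes the window keeps events whose timestamps fall within the last t time units, capped at the n most recent. The class and method names are hypothetical.

```python
from collections import deque

class TimeWindow:
    """Keeps at most max_events events no older than width time units."""

    def __init__(self, width, max_events):
        self.width = width
        # A bounded deque silently drops the oldest event once it is full.
        self.events = deque(maxlen=max_events)

    def add(self, timestamp, payload):
        """Record an event, then expire events that left the time window."""
        self.events.append((timestamp, payload))
        while self.events and self.events[0][0] <= timestamp - self.width:
            self.events.popleft()

    def snapshot(self):
        """Return the events currently in the window, oldest first."""
        return list(self.events)

window = TimeWindow(width=10, max_events=3)
for ts, reading in [(0, "a"), (4, "b"), (8, "c"), (12, "d")]:
    window.add(ts, reading)
```

Timestamps are assumed to arrive in non-decreasing order, as they would from a single merged event stream.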

2. Security and Privacy In Ultra Low Power Devices

(Term Paper)


Advisor - Dr S. Nandi, Prof., Department of CSE, IIT Guwahati

         Presented a poster on the various cryptographic algorithms used in Radio Frequency Identification (RFID) and other low-power devices. The poster also highlighted some suggestions for reducing cost and saving energy in RFID chips.

3. 3-D Walk Using the OpenGL Library


Advisor - Dr S. V. Rao, Prof., Department of CSE, IIT Guwahati

         Designed and developed a 3-D walk using the OpenGL library in Visual Basic. It has collision detection and produces solid objects and surfaces using boundary representation. It emulates a walk inside a single corridor/gulley with windows and solid objects.

4. Windows Based Mail Alerter



         Designed and developed a general-purpose Windows-based mail alerter, which hides as a tray icon and pops up a message whenever new mail arrives or a reminder is due.

5. H.M.C. Student Reg. Software


Advisor - Dr G. Sajith, Prof., Department of CSE, IIT Guwahati

         Designed and developed a web-based database management system (Student Registration Software) for the Hostel Management Committee (H.M.C.) of IIT-Guwahati. Oracle is used as the database, and the web interface runs as JSP on an Apache Tomcat server. The database satisfies all the integrity and relational constraints and supports all the queries in its domain.

6. LAN based Chat Server


Advisor - Dr P. K. Das, Prof., Department of CSE, IIT Guwahati

         Designed and developed a LAN-based chat server in Java. A limited number of users (up to 6) could connect to the server on a local area network and chat through it. The server stored their conversations in a database. This was done using the Java networking APIs.

7. Nachos 4.0 Implementation


Advisor - Dr G. Barua, Prof., Department of CSE, IIT Guwahati

         Developed a fully functional multiprogramming operating system running on a MIPS simulator, starting from a bare-bones kernel, as part of the operating systems course. Implemented various pThreads libraries and simulated standard starvation problems and their solutions.

8. Java Applet Tank Game


Advisor - Dr P. K. Das, Prof., Department of CSE, IIT Guwahati

         Designed and developed a multithreaded, multilevel tank game with collision detection and event listeners, using the Java Applet and Java AWT APIs. The game has one user tank, which can be maneuvered to destroy enemy tanks by firing shells. A shell hitting an enemy tank makes it explode and vanish from the arena.

9. Numerical Solutions to BVPs


Advisor - Dr S. Natesan, Prof., Dept. of Mathematics, IIT Guwahati

         Numerically solved Boundary Value Problems (BVPs) such as convection-diffusion and heat diffusion. The method uses a mesh to discretize the domain and computes the function values by solving the resulting matrix systems. Programming was done in Matlab.
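The mesh-and-matrix approach can be illustrated on a simpler model problem than those in the project. The sketch below is in Python rather than Matlab and assumes the steady one-dimensional diffusion problem -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0, a uniform mesh, second-order central differences, and the Thomas algorithm for the tridiagonal system. The function name is hypothetical.

```python
import math

def solve_poisson_1d(f, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, at n interior mesh points."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    # Central differences give tridiag(-1, 2, -1) u = h^2 f.
    lower = [-1.0] * n
    diag = [2.0] * n
    upper = [-1.0] * n
    rhs = [h * h * f(x) for x in xs]
    # Thomas algorithm, forward elimination.
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Thomas algorithm, back substitution.
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return xs, u

# With f(x) = pi^2 sin(pi x), the exact solution is u(x) = sin(pi x).
xs, u = solve_poisson_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), 99)
max_error = max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
```

The discretization error shrinks in proportion to the square of the mesh spacing, so halving the spacing should cut max_error by roughly a factor of four.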

10. 4-bit CPU


Advisor - Dr S. B. Nair, Prof., Department of CSE, IIT Guwahati

         Designed and implemented a 4-bit CPU with an ALU, register set, main memory, and an instruction set of 16 assembly language instructions. A minimal number of primitive chips was used, and the design was tuned to minimize the time taken to execute each instruction. Wrote and ran assembly language programs on the CPU, including recursive procedures. Microprogramming was used to implement the architecture.

11. Automation of IIT Guest House


Advisor - Dr P. K. Das, Prof., Department of CSE, IIT Guwahati

         Designed and developed efficient, user-friendly software to automate the Guest House of IITG, providing features such as allocation and reallocation of rooms to new guests, automatic alerts when all the rooms in the Guest House are full, and easy online payments. The software was designed with CASE tools (Rational Rose) using UML diagrams and developed on the Visual Basic .NET platform. The database was an MS Access instance.