Archive for the ‘Language’ Category

Augmenting Human Innovation with Social Cognition

Social Media is everywhere: photos, videos, news, blogs, art, music, games… even business, finance, healthcare, government, design, and other serious applications are going social. These social media have given rise to Social Cognition. What began with sharing has moved to creation. Consumers have become producers, and commerce has become a conversation.

Because of these conversations, individuals are no longer alone; whether you’re making a life decision, solving a critical business problem, or merely looking for a restaurant, your social graphs are available to augment your decision-making process. These graphs have no geographic boundaries; professional networks are worldwide, and information streams from far corners of the globe into the palm of your hand.

Beyond media and commerce, the next big disruption is innovation. Humans everywhere want to innovate, and Social Cognition can augment human innovation in many everyday and expert domains.

I discuss three human capabilities that are amenable to social augmentation: problem solving, learning, and creativity. I illustrate them with challenge problems from my work: 1) healthcare: helping consumers find relevant health information without search; 2) energy: helping experts troubleshoot complex turbine failures; 3) learning: scaling education to a hundred million people; and 4) creativity: enabling average users to create artificial intelligence agents without programming.

These technologies blend Cognitive Systems (artificial intelligence) and Cognitive Science (human cognition) in products that both exhibit and support cognition in large-scale social communities. This research not only provides scientific insight but also creates disruptive business opportunities.

Invited talk at PARC, Palo Alto, CA, April 7, 2011.
 
Invited talk at Wright State University, Center of Excellence in Human-Centered Innovation, Dayton, OH, October 24, 2010.
 

Intentional analysis of medical conversations for community engagement

With the explosion of user-generated content in communities, information overload is increasing and the quality of readily available online content is deteriorating. There is a growing need for intelligent systems that use the implicit user-generated knowledge in communities to drive community engagement. We describe an approach that models user utterances in communities in order to proactively target the community for the exchange of questions and answers. We envision a system that automatically encourages user engagement and participation by routing relevant conversations to users based on individual and community activity levels.

In this paper, we analyze health forum conversations from WebMD, a popular consumer health portal, and classify them into different speech acts using Verbal Response Modes (VRM) theory. Based on observations from this analysis, we describe our approach for modeling an intelligent community recommender that engages participants.

Read the paper:

Intentional analysis of medical conversations for community engagement

by Saurav Sahay, Hua Ai, Ashwin Ram

FLAIRS-11 International Conference on Artificial Intelligence
www.cc.gatech.edu/faculty/ashwin/papers/er-11-01.pdf

Conversational Framework for Web Search and Recommendations

We introduce a Conversational Interaction framework as an innovative and natural approach to facilitate easier information access by combining web search and recommendations. This framework includes an intelligent information agent (Cobot) in the conversation to provide contextually relevant social and web search recommendations. Cobot supports the information discovery process by integrating web information retrieval with proactive connections to relevant users who can participate in real-time conversations. We describe the conversational framework and report on some preliminary experiments with the system.

Read the paper:

Conversational Framework for Web Search and Recommendations

by Saurav Sahay, Ashwin Ram

ICCBR-10 Workshop on Reasoning from Experiences on the Web (WebCBR-10), Alessandria, Italy, 2010.
www.cc.gatech.edu/faculty/ashwin/papers/er-10-01.pdf

Collaborative Information Access: A Conversational Search Approach

Knowledge and user-generated content are proliferating on the web in scientific publications, information portals, and online social media. This knowledge explosion has continued to outpace innovation in efficient information access technologies. In this paper, we describe methods and technologies for “Conversational Search” as an innovative solution to facilitate easier information access and reduce information overload for users.

Conversational Search is an interactive and collaborative way of finding information. The participants engage in social conversations aided by an intelligent information agent (Cobot) that provides contextually relevant search recommendations. The collaborative and conversational search activity helps users search and discover faster and in a more informed way. It also helps the agent learn about conversations from interactions and social feedback so that it can make better recommendations. Conversational search leverages the social discovery process by integrating web information retrieval with social interactions.

Read the paper:

Collaborative Information Access: A Conversational Search Approach

by Saurav Sahay, Anu Venkatesh, Ashwin Ram

ICCBR-09 Workshop on Reasoning from Experiences on the Web (WebCBR-09), Seattle, July 2009
www.cc.gatech.edu/faculty/ashwin/papers/er-09-05.pdf

Using Content Analysis to Investigate The Research Paths Chosen by Scientists over Time

We present an application of a clustering technique to a large original dataset of SCI publications that is capable of disentangling the different research lines followed by a scientist, their duration over time, and the intensity of effort devoted to each of them. Information is obtained by means of software-assisted content analysis, based on the co-occurrence of words in the full abstract and title of a set of SCI publications authored by 650 American star physicists across 17 years. We estimate that, over this time span, the scientists in our dataset contributed on average to 16 different research lines, each lasting on average 3.5 years, and published nearly 5 publications in each line of research. The technique is potentially useful for scholars studying science and the research community, for research agencies evaluating whether a scientist is new to a topic, and for librarians collecting timely biographic information.
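To make the idea of grouping publications by shared words concrete, here is a minimal Python sketch. It is not the clustering technique used in the paper: it substitutes a simple Jaccard word-overlap measure with single-link grouping, works on titles only, and the stop-word list, threshold, and example titles are all invented for illustration.

```python
def tokens(text):
    """Lowercase word set for a title, minus a tiny illustrative stop-word list."""
    stop = {"a", "of", "the", "in", "and", "for", "on"}
    return {w for w in text.lower().split() if w not in stop}


def jaccard(a, b):
    """Share of words two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_overlap(titles, threshold=0.25):
    """Single-link grouping: a paper joins every group that contains
    at least one paper whose word overlap meets the threshold."""
    groups = []
    for i, title in enumerate(titles):
        words = tokens(title)
        matched = [
            g for g in groups
            if any(jaccard(words, tokens(titles[j])) >= threshold for j in g)
        ]
        merged = {i}
        for g in matched:
            merged |= g
            groups.remove(g)
        groups.append(merged)
    return groups


# Hypothetical titles standing in for one scientist's publications.
titles = [
    "quantum hall effect in graphene",
    "fractional quantum hall effect",
    "neutrino mass measurement",
    "bounds on neutrino mass",
]
research_lines = group_by_overlap(titles)
```

Each resulting group is a candidate "research line"; with publication years attached, the span between the earliest and latest paper in a group would give a rough duration for that line.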

Read the paper:

Using Content Analysis to Investigate The Research Paths Chosen by Scientists over Time

by Chiara Franzoni, Chris Simpkins, Baoli Li, Ashwin Ram

Scientometrics journal 83(1):321-335, April 2010. (Earlier version in 11th International Conference on Scientometrics and Informetrics (ISSI-07), Madrid, Spain, June 2007.)
www.springerlink.com/content/5462n515405715u2/?p=8344e997766b4ecdabee78f5e27a9faa&pi=18
www.cc.gatech.edu/faculty/ashwin/papers/er-07-06.pdf

NLP: Not (Just) Language, People

As consumers become producers and, now, participants in online social communities, there are new opportunities and challenges in the increasing amounts of textual information and interactions on the web, within enterprises, in government, and in new types of social media and virtual worlds.

Natural Language Processing (NLP) researchers have traditionally regarded language as the object of study. In this talk, I argue that NLP is as much a study of people as of language per se. Doing NLP well requires us to model and reason about Content (domain knowledge), Context (goals and tasks), and Community (social context). I discuss why modeling the three C’s is difficult, and illustrate some approaches to these problems using examples from my recent academic and commercial projects.

Invited talk at PARC (Palo Alto Research Center), Palo Alto, CA, January 2009

iReMedI – Intelligent Retrieval from Medical Information

Effective encoding of information is one of the keys to qualitative problem solving. Our aim is to explore Knowledge Representation techniques that capture meaningful word associations occurring in documents. We have developed iReMedI, a problem solving system based on textual case-based reasoning (TCBR), as a prototype to demonstrate our idea. For representation, we use a combination of NLP and graph-based techniques that we call Shallow Syntactic Triples, Dependency Parses, and Semantic Word Chains. To test their effectiveness, we have developed retrieval techniques based on PageRank, Shortest Distance, and Spreading Activation methods. The algorithms discussed in the paper and the comparative analysis of their results provide useful insight for creating an effective problem solving and reasoning system.
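As a rough illustration of one of the retrieval ideas named above, the following Python sketch runs a basic spreading activation pass over a small word-association graph. It is a generic textbook-style version, not the iReMedI implementation; the graph, the decay factor, and the number of steps are assumptions chosen for the example.

```python
from collections import defaultdict


def spreading_activation(graph, seeds, decay=0.5, steps=2):
    """Spread activation from seed nodes through their neighbors.

    graph: dict mapping a node to a list of neighbor nodes.
    seeds: dict mapping a node to its initial activation.
    At each step, a node passes `decay` times the activation it just
    received, split evenly among its neighbors.
    """
    activation = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            neighbors = graph.get(node, [])
            if not neighbors:
                continue
            share = decay * energy / len(neighbors)
            for neighbor in neighbors:
                next_frontier[neighbor] += share
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical word associations extracted from documents.
graph = {
    "chest": ["pain", "xray"],
    "pain": ["chest", "aspirin"],
    "aspirin": ["pain"],
    "xray": ["chest"],
}
scores = spreading_activation(graph, {"chest": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Starting from the query word "chest", words that are closely associated with it accumulate more activation than distant ones, so `ranked` can serve as a simple relevance ordering for retrieval.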

Read the paper:

iReMedI – Intelligent Retrieval from Medical Information

by Saurav Sahay, Bharat Ravisekar, Anu Venkatesh, Sundaresan Venkatasubramanian, Priyanka Prabhu, Ashwin Ram

9th European Conference on Case-Based Reasoning (ECCBR-08), Trier, Germany
www.cc.gatech.edu/faculty/ashwin/papers/er-08-05.pdf

Subjectivity Analysis for Questions in QA Communities

In this paper we investigate how to automatically determine the subjectivity orientation of questions posted by real users in community question answering (CQA) portals. Subjective questions seek answers containing private states, such as personal opinion and experience. In contrast, objective questions request objective, verifiable information, often with support from reliable sources. Knowing the question orientation would be helpful not only for evaluating answers provided by users, but also for guiding the CQA engine to process questions more intelligently. Our experiments on Yahoo! Answers data show that our method exhibits promising performance.

Read the paper:

Subjectivity Analysis for Questions in QA Communities

by Baoli Li, Yandong Liu, Ashwin Ram, Ernie Garcia, Eugene Agichtein

31st Annual International ACM SIGIR Conference (ACM-SIGIR-08), Singapore, July 2008
www.cc.gatech.edu/faculty/ashwin/papers/er-08-02.pdf

Discovering Semantic Biomedical Relations Utilizing The Web

To realize the vision of a Semantic Web for Life Sciences, discovering relations between resources is essential. It is very difficult to automatically extract relations from Web pages expressed in natural language formats. On the other hand, because of the explosive growth of information, it is difficult to extract the relations manually. In this paper, we present techniques to automatically discover relations between biomedical resources from the Web. For this purpose, we retrieve relevant information from Web search engines and the PubMed database using various lexico-syntactic patterns as queries over SOAP web services. The patterns are initially handcrafted but can be progressively learned. The extracted relations can be used to construct and augment ontologies and knowledge bases. Experiments are presented for general biomedical relation discovery and domain-specific search to show the usefulness of our technique.
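To show what a handcrafted lexico-syntactic pattern can look like, here is a minimal Python sketch that applies two regular-expression patterns to text snippets that have already been retrieved. The patterns, relation names, and snippets are hypothetical and are not taken from the paper, and the sketch does not query any search engine, PubMed, or SOAP service.

```python
import re

# Hypothetical handcrafted patterns: each maps a relation name to a
# regular expression with two named slots, `a` and `b`.
TERM = r"[A-Za-z][\w-]*"
PATTERNS = {
    "treats": re.compile(
        rf"(?P<a>{TERM}) (?:treats|is used to treat) (?P<b>{TERM})"
    ),
    "causes": re.compile(
        rf"(?P<a>{TERM}) (?:causes|leads to) (?P<b>{TERM})"
    ),
}


def extract_relations(snippets):
    """Return (subject, relation, object) triples found in the snippets."""
    found = []
    for text in snippets:
        for name, pattern in PATTERNS.items():
            for match in pattern.finditer(text):
                found.append(
                    (match.group("a").lower(), name, match.group("b").lower())
                )
    return found


# Invented snippets standing in for retrieved search results.
snippets = [
    "Aspirin is used to treat fever in adults.",
    "Smoking causes emphysema.",
]
relations = extract_relations(snippets)
```

Triples like these could then be added as edges to an ontology or knowledge base. Single-word terms keep the sketch short; real biomedical names are often multi-word and would need a term recognizer rather than a one-token slot.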

Read the paper:

Discovering Semantic Biomedical Relations utilizing the Web

by Saurav Sahay, Sougata Mukherjea, Eugene Agichtein, Ernie Garcia, Sham Navathe, Ashwin Ram

ACM Transactions on Knowledge Discovery from Data, 2(1):3, 2008
www.cc.gatech.edu/faculty/ashwin/papers/er-08-01.pdf

Semantic Annotation and Inference for Medical Knowledge Discovery

We describe our vision for a new-generation medical knowledge annotation and acquisition system called SENTIENT-MD (Semantic Annotation and Inference for Medical Knowledge Discovery). Key aspects of our vision include deep Natural Language Processing techniques to abstract the text into a more semantically meaningful representation guided by a domain ontology. In particular, we introduce a notion of semantic fitness to model an optimal level of abstract representation for a text fragment given a domain ontology. We apply this notion to appropriately condense and merge nodes in semantically annotated syntactic parse trees. These transformed, semantically annotated trees are more amenable to analysis and inference for abstract knowledge discovery, such as automatically inferring general medical rules to enhance an expert system for nuclear cardiology. This work is part of a long-term research effort on continuously mining medical literature for automatic clinical decision support.

Read the paper:

Semantic Annotation and Inference for Medical Knowledge Discovery

by Saurav Sahay, Eugene Agichtein, Baoli Li, Ernie Garcia, Ashwin Ram

NSF Symposium on Next Generation of Data Mining (NGDM-07), Baltimore, MD, October 2007
www.cc.gatech.edu/faculty/ashwin/papers/er-07-16.pdf