1. Natural Language Processing

The present decade has seen a blossoming of open NLP tools and applications. Recent advances in theoretical underpinnings and language representations are expected to drive progress in language understanding and further industrialize the language landscape. Currently (mid 2015), the hottest questions in the field are:

  • How to understand natural languages and design natural language user interfaces?
  • How to extract actionable intelligence from social media or customer feedback (e.g. sentiment analysis, named entity extraction)?
  • How to find a suitable topic name (title) for a given text?
  • What are current benchmark datasets for relationship extraction?
  • What model or machine learning algorithms can I use to do corpus-oriented writing?
  • What are some good algorithms to categorise a sentence with a set of well-known topics?
  • Why did Google share their "ngrams" data for free?
  • What are some ways I can test the error of applying a topic model to tweets, given that there is no known corpus of topic labels?
  • What applications visualize topic models as topic or concept maps?
  • When computing similarity between documents using LDA, (how) can I give more "weight" to certain topics in the comparison?
  • What startups are hiring engineers with strengths in machine learning/NLP?
  • How to model and develop an open source corpus for topic modeling?
  • What are the most effective open source topic analysis tools for text?
  • What is the current best grammar checker on the market and how well does it perform?
  • What is the current status of systems which can answer questions framed from a given text article?
  • How hard would it be to use NLP to auto-generate a Topic FAQ for any given topic?
  • Is it possible to use Wikipedia's ontology to auto-generate topic tags for questions on Quora that mention words that might fit under some art...
  • What are some good, active blogs for computational linguistics or NLP?
  • What is the best approach for text categorization?
  • What is the 'state of the art' in parsing a person's name?
  • What is an algorithmic way to find all smileys in a text?
  • How to use NLP to support education?
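
One of the questions above asks for an algorithmic way to find smileys in a text. A minimal sketch, assuming only western-style emoticons such as :-) or ;D matter, is a regular expression over "eyes, optional nose, mouth". The pattern and the function name find_smileys are illustrative assumptions, not an established standard, and the pattern will miss emoji and many other styles.

```python
import re

# Assumed pattern: eyes (: ; =), an optional nose (- o * '), then one mouth
# character. The lookarounds keep it from firing inside words or URLs.
SMILEY_RE = re.compile(r"(?<!\w)[:;=][\-o*']?[)\](\[DdPp/\\|](?!\w)")

def find_smileys(text):
    """Return all emoticon-like substrings in text, in order of appearance."""
    return SMILEY_RE.findall(text)

print(find_smileys("Great talk :-) but the demo failed :( see you ;D"))
# prints [':-)', ':(', ';D']
```

A hand-written pattern like this is easy to audit, but in practice it would need to be extended or replaced by a curated emoticon list for real social media text.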

2. Big Data Text Analytics

The major areas of text analytics are text identification, text mining, text categorization, text clustering, search access, entity relation modeling, link analysis, sentiment analysis, summarization, and visualization.

Social media analytics: Social media content (Twitter, Facebook, etc.) is unstructured text that requires specialized natural language processing techniques and tools to analyze, and it never stops coming. Together, these factors can make it a tall order for businesses to make sense of social media data.

Sentiment Analysis: Apply state-of-the-art natural language processing techniques to extract sentiment about brands, organisations, products and persons from user-generated content in social media, e.g. Twitter.
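
To make the idea of sentiment extraction concrete, the following is a minimal lexicon-based sketch. It is a simple baseline, not a state-of-the-art method: the word lists, the one-token negation rule, and the function name sentiment_score are all assumptions chosen for illustration.

```python
import re

# Tiny illustrative lexicons (assumed; real systems use much larger resources).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "broken"}
NEGATORS = {"not", "no", "never"}

def sentiment_score(text):
    """Sum word polarities; a negator flips only the token right after it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)  # +1, -1 or 0
        if polarity:
            score += -polarity if negate else polarity
        negate = False
    return score

print(sentiment_score("I love this phone, great camera"))  # prints 2
print(sentiment_score("not good, the screen is broken"))   # prints -2
```

A positive score suggests positive sentiment and a negative score the opposite. The negation rule is deliberately crude: "not very good" would still count as positive.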

Discovering Influential Users: Identify users with high potential to become popular in social media from their profile and text features by applying state-of-the-art machine learning techniques.

Discovering Hidden Topics: Automatically identify topics being discussed in large document collections by applying topic modelling.

Organise and Visualize Document Collections: Enhance information access and provide exploratory search by representing large document collections using the latent topics discussed within them.
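
Once each document is represented by its latent topics, exploratory search can rank documents by how similar their topic distributions are. The sketch below assumes the topic vectors have already been produced by some topic model (the numbers are made up) and uses a weighted cosine similarity, so that certain topics can count more in the comparison. The function name weighted_cosine and the weighting scheme are illustrative assumptions.

```python
import math

def weighted_cosine(p, q, weights=None):
    """Cosine similarity of two topic vectors with optional per-topic weights."""
    if weights is None:
        weights = [1.0] * len(p)
    dot = sum(w * a * b for w, a, b in zip(weights, p, q))
    norm_p = math.sqrt(sum(w * a * a for w, a in zip(weights, p)))
    norm_q = math.sqrt(sum(w * b * b for w, b in zip(weights, q)))
    if norm_p == 0 or norm_q == 0:
        return 0.0  # no overlap can be measured on the weighted topics
    return dot / (norm_p * norm_q)

# Hypothetical topic distributions over three topics.
doc_a = [0.7, 0.2, 0.1]
doc_b = [0.1, 0.2, 0.7]

print(weighted_cosine(doc_a, doc_b))             # all topics count equally
print(weighted_cosine(doc_a, doc_b, [0, 1, 0]))  # only the second topic counts
```

With equal weights the two documents look dissimilar, while restricting the comparison to the second topic, where they agree, makes them look alike.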


3. Machine Learning

Machine Learning & combining inductive/deductive methods: Machine learning and related techniques have long been applied at large scale to make sense of, or extract "knowledge" from, messy, unstructured corpora. With resources like Linked Data and topics like Big Data growing, there is now renewed interest in how to combine deductive reasoning with inductive/statistical/heuristic methods in various areas.

Machine Learning for Education: The aim is to elicit new connections between machine learning and the diverse fields concerned with education, identify novel tools and models that can be transferred from one to the others, and explore novel machine learning applications that will benefit the education community. Topics of interest include learning and content analytics, scheduling, automatic grading systems, cognitive psychology, and experimental design.

Constructive Machine Learning: Constructive machine learning describes a class of machine learning problems where the ultimate goal is not finding a good model of the data but rather one or more particular instances of the domain which are likely to exhibit desired properties. While traditional approaches choose these instances from a given set of unlabeled instances, constructive machine learning is typically iterative and searches an infinite or exponentially large instance space.

Deep Learning: Deep learning is a fast-growing field of Machine Learning concerned with the study and design of computer algorithms for learning good representations of data, at multiple levels of abstraction. There has been rapid progress in this area in recent years, both in terms of methods and in terms of applications, which are attracting the major IT companies. Many challenges remain, however, in aspects like large-scale (hyper-) parameter optimization, modeling of temporal data with long-term dependencies, generative modeling, efficient Bayesian inference for deep learning, multi-modal data and models, and learning representations for reinforcement learning.

Reinforcement Learning: The objective of reinforcement learning (RL) is to develop agents able to learn optimal policies in unknown environments by trial and error and with limited supervision. Recent developments in exploration-exploitation, online learning, and representation learning are making RL more and more appealing for real-world applications, with promising results in challenging domains such as recommendation systems, computer games, and robotics.
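
The exploration-exploitation trade-off mentioned above can be illustrated with a multi-armed bandit, one of the simplest trial-and-error settings. The sketch below uses the epsilon-greedy strategy. The reward probabilities, the parameter values, and the function name epsilon_greedy_bandit are assumptions chosen for illustration, and a bandit has no state, so this is a simplification of full RL.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Learn which arm pays best using epsilon-greedy action selection.

    true_means holds the (hidden) success probability of each arm.
    Returns the learned reward estimates and the pull count per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean reward for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(estimates, counts)
```

The agent never sees true_means directly. It only observes rewards, yet over time most pulls should go to the arm with the highest success probability.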

Natural Language Understanding using ML: Building systems that can understand human language—being able to answer questions, follow instructions, carry on dialogues—has been a long-standing challenge since the early days of AI. Due to recent advances in machine learning, there is again renewed interest in taking on this formidable task. A major question is how one represents and learns the semantics (meaning) of natural language, to which there are only partial answers.


4. Automated Software Engineering

Under the umbrella of automated software engineering (ASE), knowledge representation and artificial intelligence techniques are applied to automate various phases and processes of software engineering. The following are the major topics of interest in ASE:

  • Data mining for software engineering
  • Domain modeling and meta-modeling
  • Human-computer interaction
  • Knowledge acquisition and management
  • Automated software testing, verification, and validation
  • Model-driven engineering
  • Automated requirements engineering
  • Automated software architecture and design
  • Model-based software development
  • Model transformations

5. Semantic Web

Research in this area covers ontology-based, user- and query-centric approaches to information integration, and the acquisition of sufficient statistics for learning from data under different access and resource constraints from heterogeneous, distributed, autonomous, ubiquitous information sources. It also covers ontology design, ontology tools, ontology-extended information sources, ontology-extended workflow components, ontology-extended agents and services, and semantic workflow composition. The following are the main topics of research in the area of the Semantic Web:

Scalable reasoning: reasoning has always been a hot topic in the SW community, but now there is increasing demand for reasoners that can scale into the billions of triples and handle messy data. Distributed reasoning has been a hot topic for a while, as have new languages (like the profiles of OWL), and ways to deal with provenance.

SPARQL performance: also, there has been increased demand for high performance SPARQL engines. This continues to be a hot topic, and I would expect to see a lot of papers tackling scalability and performance aspects of SPARQL 1.1's new features.

Instance matching / entity matching / consolidation / linking: we now have lots of data on the Web about all sorts of instances. As such, mechanisms to (semi-)automatically link descriptions of related (or possibly even equivalent) resources are becoming more and more important (more so, I feel, than the more traditional ontology matching field).
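
As a rough illustration of semi-automatic instance matching, the sketch below proposes links between two datasets by comparing the labels of their resources with Jaccard similarity over word tokens. The identifiers, labels, threshold, and function names are hypothetical, and real linking systems typically compare many properties rather than labels alone.

```python
import re

def tokens(label):
    """Lowercased alphanumeric word tokens of a label, as a set."""
    return set(re.findall(r"[a-z0-9]+", label.lower()))

def jaccard(a, b):
    """Jaccard similarity of the token sets of two labels."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def link_candidates(source, target, threshold=0.5):
    """Return (source_id, target_id, score) for label pairs above threshold."""
    links = []
    for s_id, s_label in source.items():
        for t_id, t_label in target.items():
            score = jaccard(s_label, t_label)
            if score >= threshold:
                links.append((s_id, t_id, score))
    return links

# Hypothetical resources from two datasets describing people.
source = {"ex:a1": "Tim Berners-Lee"}
target = {"ex:b1": "Berners-Lee, Tim", "ex:b2": "Tim Cook"}

print(link_candidates(source, target))
# prints [('ex:a1', 'ex:b1', 1.0)]
```

The candidate links would normally be reviewed by a person or a second classifier before being published, which is the "semi" in semi-automatic. The pairwise loop is quadratic, so larger datasets need blocking or indexing.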

Linked Data "science": How can we interact with Linked Data? How can we consume it? How can we link it? How dynamic is it? How big is it? How useful is it? How correct is it? What kind of quality can we expect? Linked Data topics are seeing more and more attention.

Semantic sensor streams: As more and more devices are connected to the internet, how can semantic technology help to make sense of the data streams they produce? How can we reason and query over such data? Topics include temporal reasoning, window-based querying/continuous SPARQL querying, etc.
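
Window-based querying can be sketched without any semantic machinery: a continuous query keeps only the readings inside a time window and re-evaluates an aggregate as new data arrives. The example below is plain Python rather than continuous SPARQL, and the reading format and the function name windowed_average are assumptions for illustration.

```python
from collections import deque

def windowed_average(readings, window_seconds):
    """Yield (timestamp, average) over a sliding time window.

    readings is an iterable of (timestamp, value) pairs in time order.
    A reading stays in the window while it is newer than
    timestamp - window_seconds.
    """
    window = deque()
    total = 0.0
    for ts, value in readings:
        window.append((ts, value))
        total += value
        # evict readings that have fallen out of the window
        while window[0][0] <= ts - window_seconds:
            _, old = window.popleft()
            total -= old
        yield ts, total / len(window)

# Hypothetical temperature readings: (seconds, degrees).
stream = [(0, 10.0), (1, 20.0), (2, 30.0), (5, 40.0)]
print(list(windowed_average(stream, window_seconds=3)))
# prints [(0, 10.0), (1, 15.0), (2, 20.0), (5, 40.0)]
```

Because the function is a generator, it can consume an unbounded stream while holding only the current window in memory.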


6. Satellite Image Analysis

Professional satellite image processing and interpretation services are employed by governments, militaries and various agencies and authorities to detect and identify relevant objects within an image. Whether covering vast areas or specific sites, satellite imaging is a particularly useful medium for:

  • Weather prediction
  • National security and defense
  • Maintaining law and order
  • Regional development planning, zoning and monitoring
  • Emergency planning and situation management
  • Many other types of management