
MSc and PhD Theses


(Unless otherwise stated, all texts are in Portuguese)

An Event-Based Approach to Process Environmental Data
 
Management of integrity constraints for multi-scale geospatial data
 
New Load Distribution Techniques for Geographically Distributed Web Servers (Novas Técnicas de Distribuição de Carga para Servidores Web Geograficamente Distribuídos)
Load distribution is a problem intrinsic to distributed systems. This thesis addresses the problem in the context of geographically distributed web servers. Replicating web servers across geographically distributed datacenters provides fault tolerance and the possibility of offering better response times to clients. A key question in such scenarios is the efficiency of the load distribution solution employed to divide the system load among the server replicas. Load distribution lets providers make better use of their resources, easing the need for over-provisioning and helping to tolerate load peaks until the system can be adjusted. The goal of this work was to study and propose new load distribution solutions for geographically distributed web servers. To this end, two tools were implemented to support the analysis and development of new solutions: a test platform built on top of a real web service implementation, and simulation software based on a realistic model of web load generation. The main contributions of this thesis are four new load distribution solutions spanning three different types: DNS-based, client-based and server-based solutions.
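The DNS-based flavor mentioned above can be pictured with a minimal sketch (the class, addresses and least-loaded policy are invented for illustration; the thesis's actual algorithms are not reproduced here): an authoritative resolver answers each lookup with the address of the replica currently reporting the lowest load.

```python
# Hypothetical sketch of DNS-based load distribution: the authoritative
# resolver picks, for each lookup, the replica with the lowest reported load.

class LoadAwareResolver:
    def __init__(self, replicas):
        # replicas: mapping from replica IP to its last reported load (0.0-1.0)
        self.replicas = dict(replicas)

    def report_load(self, ip, load):
        """Replicas periodically push their current load to the resolver."""
        self.replicas[ip] = load

    def resolve(self, hostname):
        """Answer a DNS lookup with the least-loaded replica's address."""
        return min(self.replicas, key=self.replicas.get)

resolver = LoadAwareResolver({"10.0.0.1": 0.9, "10.0.0.2": 0.2})
print(resolver.resolve("www.example.com"))  # → 10.0.0.2
```

In practice the DNS answer's TTL bounds how stale this decision can become, which is one of the trade-offs such solutions must weigh.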
Management of heterogeneous climate data for applications in agriculture
 
A Model for Content-Based Retrieval of Remote Sensing Images (Um modelo para recuperação por conteúdo de imagens de sensoriamento remoto)
Content-based image retrieval has attracted much interest in recent years, with multiple applications across different imaging domains. One class of images for which the problem has not been solved satisfactorily is that of Remote Sensing. Remote Sensing Images (RSI) are obtained by combining the sensing of the Earth in multiple spectral bands. This thesis addresses the problem of content-based retrieval of RSI. This kind of retrieval starts from the characterization of an image's content, and one of its main approaches, the one followed in this thesis, relies on mathematical models from the field of Image Processing. We address the RSI retrieval process using three main resources: texture and color patterns as the basic query element, multiple mathematical models for representing and characterizing content, and a relevance feedback mechanism for the query process. The main contributions of the thesis are: (1) an analysis of the problems of content-based retrieval for RSI; (2) the proposal of a model for this retrieval; (3) a similarity model and metric based on the proposed model; and (4) a proposed implementation of query processing that shows the viability of the model.
A Computational Environment for Modeling Environmental Applications
Geographic applications are intrinsically complex due to the nature of the data manipulated and also due to the processes acting over these data. Today, many of these applications are built on top of a GIS (Geographic Information System), software that provides efficient storage, analysis and presentation tools for spatial data. Nevertheless, GISs present limitations that prevent users from taking full advantage of available GIS tools. These limitations are mainly related to their interface and modeling features, and also to the fact that end users are experts in their application domain but do not have an adequate background in software engineering or database design. This thesis is a contribution towards solving these two limitations, presenting UAPE, an environment for modeling and designing geographic applications. With the environment, users are able to design applications according to their needs, abstracting the implementation details of the underlying GIS. The major contributions are: (a) an object-oriented model, GMOD, which supports both data and process modeling; (b) a methodology for environmental application design; and (c) an environment, UAPE, that integrates model and methodology in order to help users in environmental application modeling and design.
Multimodal Search to Support Biodiversity Research (Busca multimodal para apoio à pesquisa em biodiversidade)
Research in computing applied to biodiversity presents many challenges, such as the large amount of data and its heterogeneity and variety. The search tools available for such data are still limited and usually consider only textual data, failing to explore the potential of searching over data of other kinds, such as images or sounds. The goal of this project is to analyze the problems of performing multimodal queries combining text and image in the biodiversity domain, proposing a set of tools to process such queries. With this integrated search, biodiversity data retrieval is expected to become more comprehensive, helping biodiversity researchers in their tasks and also encouraging lay users to access these data. This work is part of the BioCORE project, a partnership between computer science and biology researchers to improve biodiversity research.
Use of Learning Techniques for Image Classification and Retrieval (Uso de Técnicas de Aprendizagem para Classificação e Recuperação de Imagens)
Learning techniques have been employed in several application areas (medicine, biology, security, among others). This work evaluates the use of Genetic Programming (GP) in image retrieval and classification tasks. GP searches for optimal solutions inspired by the theory of natural selection of species: fitter individuals (better solutions) tend to evolve and reproduce in future generations.
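The GP loop can be illustrated with a toy sketch (the symbolic-regression task, tree representation and parameters below are invented for illustration, not taken from the thesis): expression trees are ranked by fitness, the fitter half survives, and mutated copies fill the next generation.

```python
import random

# Toy genetic programming sketch: evolve arithmetic expression trees to
# approximate a target function on sample points. Lower error = fitter,
# and fitter individuals reproduce, as in the natural-selection analogy.

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_tree(depth=2):
    if depth == 0 or random.random() < 0.3:
        return "x" if random.random() < 0.5 else random.randint(-2, 2)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == "x":
        return x
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree, samples):
    # Sum of squared errors against the target; lower is fitter.
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)

def mutate(tree):
    # Replace a random subtree with a freshly generated one.
    if random.random() < 0.3 or not isinstance(tree, tuple):
        return random_tree()
    op, left, right = tree
    if random.random() < 0.5:
        return (op, mutate(left), right)
    return (op, left, mutate(right))

def evolve(samples, pop_size=60, generations=30):
    population = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, samples))
        survivors = population[: pop_size // 2]           # selection
        offspring = [mutate(random.choice(survivors)) for _ in survivors]
        population = survivors + offspring                # next generation
    return min(population, key=lambda t: fitness(t, samples))

random.seed(0)
target = [(x, x * x + 1) for x in range(-3, 4)]           # learn f(x) = x^2 + 1
best = evolve(target)
print(fitness(best, target))
```

In the image setting described above, the "samples" would instead be training images and the evolved expression would combine descriptor similarities; crossover, omitted here for brevity, is the other standard GP operator.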
A Web Service for Annotating Vector Geographic Data and its Application in Biodiversity Information Systems (Serviço Web para Anotação de Dados Geográficos Vetoriais e sua Aplicação em Sistemas de Informação de Biodiversidade)
 
Semantic Annotation of Geospatial Data (Anotação Semântica de Dados Geoespaciais)
 
Ontology-Based Query Processing for Biodiversity Systems (Processamento de Consultas Baseado em Ontologias para Sistemas de Biodiversidade)
Biodiversity information systems deal with a heterogeneous set of information provided by different research groups. This diversity may concern the species studied, the structure of the collected information, the study site, work methodologies or researcher goals, among other factors. This heterogeneity of data, users and procedures hinders the reuse and sharing of information. This work contributes to reducing that obstacle by improving the query process in biodiversity information systems. To this end, it proposes a query expansion mechanism that pre-processes a user (scientist) query, aggregating additional information drawn from ontologies to bring the result closer to the user's intention. The mechanism is based on Web services and was implemented and tested using real data and use cases.
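The query-expansion idea can be sketched as follows (the toy ontology, relation names and depth limit are hypothetical, not taken from the thesis): before execution, each user term is enriched with its synonyms and subclasses drawn from an ontology.

```python
# Hypothetical sketch of ontology-driven query expansion: user terms are
# enriched with synonyms and subclasses before the query is executed.

ONTOLOGY = {
    "felidae": {"synonyms": ["cats"], "subclasses": ["panthera", "lynx"]},
    "panthera": {"synonyms": ["big cats"], "subclasses": ["panthera onca"]},
}

def expand(term, depth=1):
    """Return the term plus its synonyms and (up to `depth`) its subclasses."""
    entry = ONTOLOGY.get(term, {})
    terms = {term} | set(entry.get("synonyms", []))
    if depth > 0:
        for sub in entry.get("subclasses", []):
            terms |= expand(sub, depth - 1)
    return terms

print(sorted(expand("felidae")))
# → ['big cats', 'cats', 'felidae', 'lynx', 'panthera']
```

A query for "felidae" would then also match records annotated with any of the expanded terms, which is how the expanded result approaches the scientist's intention.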
Managing the Quality of Products in a Supply Chain (Gerenciamento de Regras de Qualidade de Produtos em Cadeias Produtivas)
Supply chains have become increasingly dependent on computer systems. Beyond the scientific challenges, there are several economic consequences. This dissertation deals with mechanisms for managing rules that specify product quality in supply chains, under two aspects: (i) the specification and storage of these rules, and (ii) the analysis of events occurring in the chain against such constraints. The dissertation starts from a traceability model for agricultural supply chains developed at UNICAMP. The quality rules managed define conditions that products must meet in order to be consumed. Rule checking is based on the analysis of variables considered critical to quality assurance, which are monitored by sensors. The research thus combines work on sensor data management, active databases and integrity constraints. The main contributions are: a detailed study of traceability associated with quality rules, a model to manage the specification, application and analysis of such rules, and a prototype to validate the architecture. The prototype is based on Web services and event dissemination. The case studies are drawn from problems in agriculture.
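The rule-checking step can be pictured with a small sketch (the variables, thresholds and rule names are invented for illustration): each quality rule is a predicate over sensor-monitored variables, evaluated against incoming readings.

```python
# Illustrative sketch: quality rules as predicates over sensor-monitored
# variables; checking a reading reports every rule it violates.

RULES = [
    ("cold chain kept", lambda r: r["temperature_c"] <= 7.0),
    ("humidity in range", lambda r: 30.0 <= r["humidity_pct"] <= 70.0),
]

def check(reading):
    """Return the names of all rules the sensor reading violates."""
    return [name for name, pred in RULES if not pred(reading)]

print(check({"temperature_c": 9.5, "humidity_pct": 55.0}))
# → ['cold chain kept']
```

In the architecture described above, a non-empty result would be disseminated as an event, so that the violating product lot can be traced and handled.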
A Comparative Study of Descriptors for Content-Based Image Retrieval on the Web (Estudo comparativo de descritores para recuperação de imagens por conteúdo na Web)
The growing quantity of images generated and made available nowadays has increased the need for search systems for this kind of information. A promising method for image search is content-based retrieval. This approach considers the visual content of images, such as color, texture and object shape, for indexing and retrieval. The central component of content-based image search is the image descriptor. An image descriptor is responsible for extracting visual properties of images and storing them in feature vectors; given two feature vectors, the descriptor compares them and returns a distance value that quantifies the difference between the images the vectors represent. In a content-based image search system, the distance computed by the descriptor is used to rank the images in the database with respect to a given query image. This dissertation carries out a comparative study of image descriptors taking the Web as the usage scenario, a setting with a very large number of images of highly heterogeneous content. The comparison follows two approaches. The first considers the asymptotic complexity of the descriptors' feature-extraction algorithms and distance functions, the sizes of the feature vectors they generate, and the environment in which each descriptor was originally validated. The second compares the descriptors in practical experiments on four different image databases, evaluating them according to extraction time, distance-computation time, storage requirements and effectiveness; color, texture and shape descriptors are compared. The experiments are run with each type of descriptor independently and, based on these results, a set of descriptors is evaluated on a database of over 230 thousand heterogeneous images, reflecting the content found on the Web. The effectiveness evaluation on the heterogeneous database is carried out through experiments with real users. The dissertation also presents a tool for the automated execution of comparative tests between image descriptors.
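The descriptor pipeline just described (feature extraction, distance function, ranking) can be sketched with a toy gray-level histogram descriptor; the 1-D "images", bin count and L1 distance here are illustrative choices, not the descriptors studied in the dissertation.

```python
# Toy image descriptor: a gray-level histogram as the feature vector and
# the L1 distance between histograms; the database is ranked by distance
# to the query image.

def extract(image, bins=4):
    """Feature vector: normalized histogram of pixel values in [0, 256)."""
    hist = [0] * bins
    for pixel in image:
        hist[pixel * bins // 256] += 1
    return [h / len(image) for h in hist]

def l1_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def rank(query, database):
    """Order database images (by name) from most to least similar."""
    q = extract(query)
    return sorted(database, key=lambda name: l1_distance(q, extract(database[name])))

db = {"dark": [10, 20, 30, 40], "bright": [220, 230, 240, 250]}
print(rank([15, 25, 35, 45], db))  # → ['dark', 'bright']
```

The comparative study's metrics map directly onto this sketch: extraction time is the cost of `extract`, distance time the cost of `l1_distance`, and storage the size of the vector `extract` returns.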
A Collection Management Service for Biodiversity Information Systems (Um Serviço de Gerenciamento de Coletas para Sistemas de Informação de Biodiversidade)
Biodiversity research requires correlations of data on living beings and their habitats. Such correlations can be of different types, considering factors such as spatial relationships or environmental descriptions (e.g., descriptions of habitats and ecosystems). Biodiversity information systems are complex pieces of software that allow researchers to perform these kinds of analyses. The complexity of these systems varies with the data used, the target users, and the environment where the systems are executed. One of the problems to be faced, especially on the Web, is the heterogeneity of the data, aggravated by the diversity of user vocabularies. This research contributes to solving this problem by presenting a database model that organizes biodiversity information using consensual data standards. The proposed model combines information collected in the field with that from museum data catalogues. The model was specified with the assistance of biologists and ecologists. The database was encapsulated in a Web service to ensure transparency in using, accessing and retrieving the information. The service is invoked by client applications. The database and service were tested and validated using real data provided by the BioCore project partners. BioCore is a research project that involves computer science and biology researchers from UNICAMP and USP.
Semi-Automatic Recognition and Vectorization of Regions in Remote Sensing Images (Reconhecimento Semi-automático e Vetorização de Regiões em Imagens de Sensoriamento Remoto)
The use of remote sensing images (RSIs) as a source of information in agribusiness applications is very common. In these applications, knowing how land is occupied is fundamental. However, recognizing and telling apart regions of agricultural crops in RSIs is still not a trivial task. Although automatic methods have been proposed for this, users often prefer to perform the recognition manually. This happens because such methods are usually designed to solve specific problems, or, when they are general-purpose, do not produce satisfactory results, so that the user invariably has to revise the results by hand. This research aimed at the specification and partial implementation of a system for the semi-automatic recognition and vectorization of regions in remote sensing images. For that, an interactive strategy called relevance feedback was used, based on the fact that the classification system can learn which regions are of interest from relevance indications given by the system's user over successive iterations. The idea is to use image descriptors to encode spectral and texture information of image partitions, and to use relevance feedback with Genetic Programming (GP) to combine the descriptors' features. GP is a machine learning technique based on the theory of evolution. The main contributions of this work are: a comparative study of image vectorization techniques; the adaptation of a recently proposed content-based image retrieval model to perform relevance feedback using image regions; the adaptation of the relevance feedback model to the recognition of regions in RSIs; the partial implementation of a system for semi-automatic recognition and vectorization of regions in RSIs; and a proposed methodology for validating the developed system.
Managing the lifecycle of sensor data: from production to consumption
Sensing devices are becoming widely disseminated, being applied in several domains, notably in scientific research. However, the increase in their number and variety introduces problems in managing the produced data, such as how to provide sensor data at distinct rates or temporal resolutions for different applications, or how to pre-process or format the data differently for each request. This work is concerned with tackling four issues that arise in the management of sensor data for scientific applications: (i) providing homogeneous access to heterogeneous sensing devices and their data; (ii) managing the composition of operations applied to sensor data; (iii) offering flexible data pre-processing facilities prior to sensor data publication; and (iv) propagating and creating valid data annotations (metadata) throughout the data life cycle. The proposed solution to issue (i) is to uniformly encapsulate both software and data by extending a component technology called Digital Content Components (DCCs), which also allows associated annotations. Using these components as a basis, the proposed solution to (ii) is to apply scientific workflows to coordinate the combination of data and software DCCs. The solution proposed to (iii) involves invoking and posting workflow specifications from the data provider, as well as using the annotations on DCCs to enrich the queries and answers. Finally, an annotation propagation mechanism is proposed as a solution to (iv). Our contributions are presented within a framework for sensor data management, which unifies aspects of data access, pre-processing, publication and annotation.
Mining sensor data time series (Mineração de séries temporais de dados de sensores)
Sensor networks have increased the amount and variety of temporal data available. This has motivated new techniques in data mining which describe different aspects of time series. Related work addresses several issues, such as indexing and clustering time series, and the definition of more efficient feature vectors and distance functions. However, most results focus on describing the values in a series, not their evolution. Furthermore, the majority of papers characterize only a single series, which is not enough in cases where multiple kinds of data must be considered simultaneously. This dissertation presents a new technique which describes time series under a distinct approach, characterizing oscillation patterns rather than the values themselves. The new descriptor -- TIDES (Time Series Oscillation Descriptor) -- is based on approximating the series by segments and then extracting the angular coefficients of the segments. TIDES supports multi-scale analysis, allowing two series to be compared at distinct granularities, which enables a more thorough analysis. The dissertation also presents several extensions to TIDES which enable describing multiple series at a time. This joint description is needed to correctly characterize phenomena which evolve jointly -- the so-called co-evolution. Experiments conducted with real temperature data for different Brazilian cities show that TIDES successfully characterizes time series oscillation.
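The core idea, approximating a series by segments and keeping only the segments' angular coefficients, can be sketched as follows (the fixed segment length and least-squares slope are illustrative choices; the actual TIDES definition is in the dissertation).

```python
# Illustrative sketch of describing oscillation rather than values:
# split the series into fixed-length segments and keep each segment's
# slope (angular coefficient) as the feature.

def slopes(series, segment=4):
    """Least-squares slope of each consecutive segment of the series."""
    feats = []
    for start in range(0, len(series) - segment + 1, segment):
        ys = series[start:start + segment]
        xs = range(segment)
        n = segment
        sx, sy = sum(xs), sum(ys)
        sxx = sum(x * x for x in xs)
        sxy = sum(x * y for x, y in zip(xs, ys))
        feats.append((n * sxy - sx * sy) / (n * sxx - sx * sx))
    return feats

rising = [1, 2, 3, 4, 5, 6, 7, 8]
shifted = [11, 12, 13, 14, 15, 16, 17, 18]   # same oscillation, higher values
print(slopes(rising) == slopes(shifted))      # → True
```

Note how the two series have identical descriptors despite disjoint value ranges, which is exactly the point of describing evolution instead of values; varying the segment length gives the multi-scale view.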
A system to support operations in non-normalized relational algebra (Sistema de operações em álgebra relacional não-normalizada)
 
Access Control in Multiversion Geographic Databases
Geographic applications increasingly influence our daily activities. Their development requires efforts from multiple teams of experts with different views and authorizations to access data. As a result, several mechanisms have been proposed to control authorization in geographic databases or to support the use of versions. These mechanisms, however, work in isolation, prioritizing either access control or versioning. This dissertation addresses this issue by proposing a unified authorization model for databases that faces both problems: it deals with access control in geographic databases while taking into account the existence of data versioning mechanisms. The model may serve as the basis for cooperative and secure work in applications that use Geographic Information Systems (GIS).
Aondê: Um Serviço Web de Ontologias para Interoperabilidade em Sistemas de Biodiversidade (Aondê: An Ontology Web Service for Interoperability across Biodiversity Information Systems)
Biodiversity research requires associating data about living beings and their habitats, constructing sophisticated models and correlating all kinds of information. Data handled are inherently heterogeneous, being provided by distinct (and distributed) research groups, which collect these data using different vocabularies, assumptions, methodologies and goals, and under varying spatio-temporal frames. This poses many kinds of challenges in Computer Science research, from the physical (e.g., diversity of storage structures) to the conceptual level (e.g., diversity of perspectives and of knowledge domains). The adoption of ontologies has been proposed as a means to help solve heterogeneity issues. However, this kind of solution gives birth to new research issues, since it implies handling problems in ontology design, management and sharing. This dissertation presents a new kind of Web Service whose goal is to help in solving such issues. Aondê (which means "owl" in Tupi, the main branch of native Brazilian languages) is a Web Service that provides a wide range of operations for storage, management, search, ranking, analysis and integration of ontologies. The text covers the specification and implementation of Aondê, which have been validated by a prototype tested with large ontologies and real biodiversity case studies.
Shape Descriptors Based on Tensor Scale (Descritores de Forma baseados em Tensor Scale)
In the past few years, the number of available image collections has increased. In this scenario, there is a demand for information systems for storing, indexing, and retrieving these images. One of the main adopted solutions is content-based image retrieval (CBIR) systems, which have the ability, for a given query image, to return the most similar images stored in the database. To answer this kind of query, it is important to have an automated process for content characterization, and for this purpose CBIR systems use image descriptors based on the color, texture and shape of the objects within the images. In this work, we propose shape descriptors based on Tensor Scale, a morphometric parameter that unifies the representation of local structure thickness, orientation, and anisotropy, and that can be used in several computer vision and image processing tasks. Besides the shape descriptors based on this morphometric parameter, we present a study of algorithms for Tensor Scale computation. The main contributions of this work are: (i) a study of image descriptors based on color, texture and shape; (ii) a study of algorithms for Tensor Scale computation; (iii) the proposal and implementation of a contour salience detector based on Tensor Scale; (iv) the proposal and implementation of new shape descriptors based on Tensor Scale; and (v) the validation of the proposed descriptors with regard to their use in content-based image retrieval systems, comparing them experimentally to other relevant, recently proposed shape descriptors.
Management of Bioinformatics Scientific Workflows (partially in Portuguese)
Bioinformatics activities are growing all over the world, following a proliferation of data and tools. This brings new challenges, such as how to understand and organize these resources, how to exchange and reuse successful experimental procedures (tools and data), and how to provide interoperability among data and tools across different sites, used by users with distinct profiles. This thesis proposes a computational infrastructure to solve these problems. The infrastructure makes it possible to design, reuse, annotate, validate, share and document bioinformatics experiments. Scientific workflows are the mechanism used to represent these experiments. Combining research on databases, scientific workflows, artificial intelligence and the semantic Web, the infrastructure takes advantage of ontologies to support the specification and annotation of bioinformatics workflows and to serve as a basis for traceability mechanisms. Moreover, it uses artificial intelligence planning techniques to support automatic, iterative and supervised composition of tasks to satisfy the needs of different kinds of user. The data integration and interoperability aspects are solved by combining the use of ontologies, structure mapping and interface matching algorithms. The infrastructure was implemented in a prototype and validated on real bioinformatics data.
Implementation of a Temporal System in an Object Oriented Database
Many temporal data models have been proposed. Most of these models incorporate time only into relational database systems. However, the applications that require temporal data management present an object-oriented nature. Research on object-oriented database systems is still in its initial phase. This work presents a practical contribution to research in this area: the development of a temporal data management system for an object-oriented database. The system, the Temporal Management Layer, was built on top of the O2 database system and allows the definition and management of object-oriented temporal data, as well as the processing of temporal queries.
An Infrastructure based on Web Service Choreography for Activity Coordination in Supply Chains
A supply chain is the set of activities involved in the creation, transformation and distribution of a product, from raw material to the consumer. Supply chain participants can work in an integrated way to optimize their performance and increase their commercial competitiveness. From the technological point of view, the distributed, autonomous and heterogeneous nature of these participants raises difficulties for the automation of interorganizational processes. This work proposes an infrastructure based on Web service choreographies for coordinating the activities that compose the interorganizational business processes of supply chains. The infrastructure implements a coordination model that aims to ease the design and deployment of interorganizational business processes. In this model, processes are represented by WS-CDL choreographies, which are mapped to executable BPEL coordination plans. The work also presents a prototype of the infrastructure to validate it.
User Interface Construction: specifying and implementing dialogue control using statecharts
A variety of techniques exists for the specification and implementation of human-computer interface control, i.e., techniques to describe and implement the syntax of the user's actions, of the computer's reactions, and of how the dialogue between user and computer evolves over time. These techniques, however, present some drawbacks. This work concentrates on the representation and implementation of dialogue syntax based on the statechart notation. A statechart is well suited to describing this kind of behaviour: it extends state transition diagrams and overcomes some of the shortcomings of the latter. The use of the statechart notation in the development of a realistic interface led to improvements made in order to apply it specifically in this context. This use showed the need for some kind of support at the presentation level and for some changes to the notation. Observing the code generated by an existing tool that implements statechart behaviour also gave us insights about desired elements in the structure of the generated code.
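Dialogue control of this kind can be pictured with a minimal flat state-machine sketch (the states and actions are invented for illustration; full statecharts add hierarchy, concurrency and history on top of this): user actions trigger transitions that determine which dialogue state the interface is in.

```python
# Minimal flat state-machine sketch of dialogue control: user actions
# drive transitions between dialogue states; actions that are invalid
# in the current state are ignored.

TRANSITIONS = {
    ("idle", "open_form"): "editing",
    ("editing", "submit"): "confirming",
    ("confirming", "ok"): "idle",
    ("confirming", "cancel"): "editing",
}

class DialogueController:
    def __init__(self, initial="idle"):
        self.state = initial

    def handle(self, action):
        """Apply a user action; stay put if it is invalid in this state."""
        self.state = TRANSITIONS.get((self.state, action), self.state)
        return self.state

ui = DialogueController()
for action in ["open_form", "submit", "cancel", "submit", "ok"]:
    ui.handle(action)
print(ui.state)  # → idle
```

The shortcomings of plain diagrams that statecharts address show up quickly here: without hierarchy, every "cancel everywhere" behaviour must be enumerated as a separate transition per state.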
Data Model Transparency in Heterogeneous Database Systems
Heterogeneous Database Systems (HDBSs) integrate, in a cooperative environment, autonomous and heterogeneous database systems (DBSs). Model transparency in HDBSs is an important property that allows users to deal with global data using a single data model and database language. This work proposes and discusses solutions to support this property in HDBSs built through the integration of network DBSs and relational DBSs. The solutions presented include methodologies for schema conversion, and architectures and algorithms for command transformation. The approach used in this work differs from previously published ones on two main points. First, it assumes that each user will manipulate global data using the data model and database language he used before the HDBS existed. Second, it proposes mechanisms to support access to the HDBS's data through application programs instead of ad-hoc transactions.
Integrity Constraints Maintenance in Object-Oriented Databases
This thesis analyses the problem of static integrity constraints in object-oriented database systems, using production rules and the active database paradigm. It shows how to automatically translate constraints into production rules, based on information from the constraints and the DBMS schema. The rule-generation algorithm was implemented and can be used for constraint maintenance not only in object-oriented database systems, but also in relational and nested database systems, being of general use. The research includes the specification of a taxonomy for constraints in object-oriented systems that considers their dynamic dimension, and the definition and implementation of a language for constraint specification to facilitate their processing. This work extends proposals by other authors, implementing support for constraints not only on data, but also on methods.
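The translation idea can be pictured with a small sketch (the constraint, names and rule shape are invented for illustration, not the thesis's actual language or algorithm): a static constraint becomes a production rule fired on update events, whose action rejects violating changes.

```python
# Illustrative sketch: a static constraint compiled into a production
# rule in event-condition-action style. On each update event, the
# condition checks the constraint; the action aborts a violation.

def compile_constraint(attribute, predicate):
    """Build an on-update production rule enforcing `predicate` on `attribute`."""
    def rule(obj, attr, value):
        if attr == attribute and not predicate(value):        # condition
            raise ValueError(f"constraint violated on {attr}")  # action: abort
        obj[attr] = value                                     # otherwise apply
    return rule

on_update = compile_constraint("salary", lambda v: v >= 0)

employee = {"salary": 1000}
on_update(employee, "salary", 1200)
print(employee["salary"])  # → 1200
```

A real active-database rule generator would emit such rules from the schema for every constrained attribute (and, per the thesis, for methods as well), rather than wiring them by hand.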
Management of traceability in food supply chains
A supply chain is a set of activities that goes from raw materials to final consumers. Supply chains present many research challenges in Computing, such as the modeling of their processes, communication between their components, logistics, and process and product management. An issue of increasing importance is enabling traceability, to ensure the origin and quality control of products. However, little has been published on implementation aspects of this problem; most papers address specific aspects and do not strive for generic solutions. This work contributes to filling this gap, considering product, process and service traceability within a supply chain. The main contributions are a model for traceability data storage, supported by a Web Services-based architecture. The work was validated by a prototype whose tests show the genericity of the solution.
Design and Implementation of User Interfaces for Geographic Applications Systems
This thesis presents a framework of techniques and models to support the design and implementation of user interfaces for geographic information systems (GIS). The proposal combines concepts from three areas of computer science -- Databases, Software Engineering and Human-Computer Interfaces -- in an innovative perspective, considering interactions not only with the user, but also with the underlying software. The framework covers both the architecture of the interface and the mechanisms for its construction. The basis of the interface-GIS integration is an object-oriented geographic database. The presented solution can be mapped to most existing interface development tools. The main results of the thesis are: a software architecture for the design and implementation of user interfaces for geographic application systems; an interface object model for building user interfaces which can be modified at run time (dynamic interfaces); an interface customization mechanism based on active databases; and the creation of reusable interface components geared towards geographic applications. The techniques and tools introduced in this thesis were applied to the design and implementation of user interfaces for two geographic application systems, in the urban and environmental areas. The results of this experience showed that the work contributes to diminishing the cost and improving the efficiency of geographic interface development.
Using hypermedia models in digital libraries for geographic data
This dissertation presents a model and a methodology for the construction of digital libraries. A digital library is here considered to be a hypermedia application, based on an object-oriented hypermedia DBMS environment. The model and methodology were used to model a specific application -- a Geographic Digital Library, whose goal is to collect and provide access to a large volume of geographic and conventional data. The construction of this library demanded the definition of a special set of metadata, which aggregates several existing standards. The geographic digital library contemplates two modes of interaction: browsing (in the traditional sense) and querying (supported by the underlying DBMS). The model integrates the OOHDM database model of Milet et al. with the Extended Dexter model, and applies extensions to this integration. The methodology extends the OOHDM proposal, adapting it to the modelling of digital libraries. The main contributions of the dissertation are: (a) a detailed survey of the requirements of digital libraries and of hypermedia data and authoring models, presented in a unified taxonomy; (b) an object-oriented hypermedia model for digital libraries; (c) a methodology which uses the model for the construction of such libraries; and (d) a detailed specification of how to build geographic digital libraries using the model and methodology.
An Object-Oriented Method for Developing Distributed Information Systems
With the increasing availability of distributed technologies, large companies have been increasingly pursuing the development of distributed information systems. However, there is a lack of methods that consider the distribution aspect from the initial phase (requirements analysis) to the final phase (implementation). Indeed, distributed architecture specifications (e.g., OMG's CORBA) support only the activities related to the software implementation process (the analysis process is not considered). This work presents an object-oriented (OO) method for developing distributed information systems which integrates concepts used in conceptual models of OO methods with concepts used in distributed architecture specifications. This integration provides better usage of today's distributed technologies (e.g., distributed databases, internet, intranet, etc.). During the analysis phase of this method, objects are grouped into subsystems based on the affinity that exists among them. This grouping process is conducted in order to induce better performance for the distributed information system. Finally, the work proposes a tool that automates the object grouping process.
Characterization of Spatial Database Systems for Performance Analysis
In this thesis, the area of spatial database systems is approached using the benchmark technique for performance analysis. This technique requires the monitoring of a database system using a real or synthetic database and workload (transactions). The ideal situation is the use of synthetic data that closely resemble the situations found in real applications. This thesis uses real data (spatial and non-spatial) from a telecommunications outside plant management system to validate and enhance techniques that provide more realistic synthetic data and workloads, and to derive conclusions useful for performance studies of database systems in general.
Version Management in Databases for GIS
Information systems rely on version models and mechanisms for the management of multiple states of modeled entities. Versions are associated mainly with the management of alternatives in CAD/CASE systems and with the representation of the historical evolution of entities in temporal systems. This dissertation studies the use of versions in Geographic Information Systems (GIS). The focus of this work is on temporal applications, multiple representations of spatial entities, and the management of alternatives in spatial design. The main results presented are: a model and a mechanism for versions to support geographic applications; and the proposal of an extension to a standard OODBMS to support the model.
Views in GIS - a model and mechanisms
This thesis analyses the functionality offered by view mechanisms in order to satisfy specific GIS needs. The main results presented are: (1) a detailed analysis of the role views can play in the GIS context; (2) the specification of an object oriented view model to be used in GIS, which shows the need for additional data and semantic information in order to support the required functionality; (3) the presentation of a mechanism to support the model; and (4) a language to specify views in this model. The work developed is validated through the modelling of a real world application using the model and language proposed.
Comparative Analysis of the Use of Relational and OO Models in GIS
Geographical Information Systems (GIS) are known as non-conventional applications, and so they need methodologies different from those used for conventional applications. However, most existing commercial GIS use conventional tools. The goal of this thesis is to verify, through an example - UNINet - the use of ER and object-oriented models (OMT) in a GIS. The modeling and implementation in a relational (SQL92) and an object-oriented DBMS (O2) are analyzed, and conclusions are derived from the results.
Active Database System Support for Topological Constraints in Geographical Information Systems
This dissertation concerns the use of active databases in geographic applications. The results presented here extend the active database systems paradigm to solve the problem of maintaining spatial (topological) constraints. The solution to this problem is divided into three steps: i) topological constraint specification; ii) translation of the constraints into rules; and iii) automatic constraint maintenance, using the generated rules. This approach was used in the development of an active system prototype that incorporates an object-oriented geographic model, thus removing the gap between GIS and rule systems. The main contributions presented are a detailed study of binary topological relationships; a complete proposal for the problem of maintaining these relationships; and the definition of algorithms to verify topological integrity (these algorithms are incorporated in the prototype).
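The general idea of maintaining a topological constraint through an event-condition-action rule can be sketched as follows. This is an illustrative simplification, not the thesis prototype: rectangles stand in for geometries, and a single hand-written rule stands in for the rules generated from constraint specifications.

```python
# Sketch of an ECA rule maintaining a binary topological constraint
# ("lots must be disjoint"): Event = geometry update; Condition = the new
# geometry overlaps another object; Action = reject the violating update.

def overlaps(a, b):
    """True if two axis-aligned rectangles (x1, y1, x2, y2) share interior points."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2

class DisjointnessRule:
    def __init__(self):
        self.geometries = {}  # object id -> rectangle

    def on_update(self, obj_id, new_geom):
        # Condition: check the topological relationship against all other objects.
        for other_id, other_geom in self.geometries.items():
            if other_id != obj_id and overlaps(new_geom, other_geom):
                return False  # Action: abort the violating update
        self.geometries[obj_id] = new_geom
        return True

rule = DisjointnessRule()
assert rule.on_update("lot1", (0, 0, 10, 10))
assert not rule.on_update("lot2", (5, 5, 15, 15))  # overlaps lot1: rejected
assert rule.on_update("lot2", (10, 0, 20, 10))     # only touches lot1: accepted
```

In the dissertation the rules are generated automatically from the constraint specification rather than written by hand, and the geometries are full objects of the geographic model.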
Incorporation of Spatial-Temporal Facilities in OODB
This dissertation presents a framework to incorporate support for spatial-temporal data in object-oriented database management systems. The main contributions are: (i) the description of a spatial-temporal object-oriented data model, allowing the representation of the evolution of spatial-temporal data, common in geographic information systems; (ii) the definition of data structures in an object-oriented database to support the model, storing spatial data in the vector format. These structures make it possible to store the temporal evolution of the objects, which encapsulate access methods to their temporal states; (iii) the specification of a taxonomy of spatial-temporal queries in geographic information systems. This proposal extends other GIS models, making it possible to incorporate new facilities in future systems.
Benchmarks for Geographic Information Systems
Geographical Information Systems (GIS) deal with data that are special in nature and size. Thus, the technologies developed for conventional database systems, such as access methods, query optimizers and languages, have to be modified in order to satisfy the needs of a GIS. These modifications, embedded in several GIS or proposed by research projects, need to be evaluated. This thesis proposes mechanisms for evaluating GIS based on benchmarks. The benchmark is composed of a workload to be submitted to the GIS being analysed and data characterizing the information. The workload is made of a set of primitive transactions that can be combined in order to derive transactions of any degree of complexity. These primitive transactions are oriented to spatial data but do not depend on the way the data are represented (vector or raster). The benchmark database characterization was defined in terms of the types of data required by applications that use georeferencing, and by the need to generate complex and controlled artificial data. The proposed technique and methods were used to show how to create the transactions and the data for a given application.
Heterogeneous Database Integration into Urban Planning Applications
The name Geographic Information Systems (GIS) denotes software that handles georeferenced data - data connected spatially to the earth's surface. Modern GIS are based on relational database systems, extended to efficiently support georeferenced applications. Recent studies indicate that the object-oriented paradigm is more adequate for this type of system. However, the migration from relational to object-oriented systems is costly. This dissertation presents a solution to this problem, which consists in defining mechanisms that allow the integration of present (relation-based) systems and new (object-based) systems, with emphasis on urban applications. The proposed architecture integrates object-oriented and relational DBMS designed, respectively, using the OMT and ECR models. In order to allow this integration, the dissertation developed primitive operations for mapping between both data models, as well as primitives for converting OMT schemas into schemas of the O2 object-oriented DBMS. This proposal was validated through the integration of two real-life applications which use the basic elements of urban planning: Telebrás' telephone network management system and Eletropaulo's electrical network management system.
Incorporating the Temporal Dimension in Object Oriented Database Systems
The last two decades have witnessed intensive research on temporal databases. Although several results have already been achieved for temporal relational systems, there are few proposals that consider incorporating the temporal dimension into object-oriented systems, despite the development of this area. This dissertation presents the following original results: a broad survey of temporal models and languages; the proposal of a new temporal model (TOODM), based on the object-oriented paradigm; and the specification of a query language (TOOL) for the proposed model.
Detection of some abrupt transitions in video sequences
A digital video is represented by a sequence of images, or frames. A video shot is an uninterrupted segment of screen time, space and graphical configurations. The problem of detecting transitions between shots can be seen as one of the most important steps in the process of segmenting and parsing a digital video. In order to detect these events automatically, some approaches rely on frame-by-frame comparison using dissimilarity measures based on color, shape and texture information, while others apply image processing techniques over a representative image of the whole video. This work describes a new approach to detect abrupt transitions and effects (cuts and flashes) in image sequences, using simple, low-computational-cost algorithms defined on the basis of pattern identification in a 1D signal. The results presented here show the good performance of the method in the identification of the corresponding events. http://libdigi.unicamp.br/document/?did=10165
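The frame-by-frame comparison the abstract mentions can be sketched as follows. This is a minimal illustration of the general idea (histogram dissimilarity thresholding), not the thesis algorithm: frames are flat lists of pixel intensities, and the threshold is a made-up parameter.

```python
# Reduce a video to a 1D dissimilarity signal (L1 distance between consecutive
# frame histograms), then flag cuts where the signal spikes above a threshold.

def histogram(frame, bins=4, max_val=256):
    h = [0] * bins
    for px in frame:
        h[px * bins // max_val] += 1
    return h

def dissimilarity_signal(frames):
    """1D signal: distance between each pair of consecutive frames."""
    hists = [histogram(f) for f in frames]
    return [sum(abs(a - b) for a, b in zip(h1, h2))
            for h1, h2 in zip(hists, hists[1:])]

def detect_cuts(frames, threshold):
    """Indices i such that a cut occurs between frame i-1 and frame i."""
    signal = dissimilarity_signal(frames)
    return [i + 1 for i, d in enumerate(signal) if d > threshold]

dark = [10] * 100    # a uniformly dark frame
light = [200] * 100  # a uniformly bright frame
video = [dark, dark, dark, light, light]
assert detect_cuts(video, threshold=50) == [3]  # cut between frames 2 and 3
```

The thesis works on patterns of the 1D signal rather than a plain threshold, which makes it robust to flashes; the sketch above only shows the signal construction step.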
Publication and Integration of Scientific Workflows on the Web
Scientific activities involve complex multidisciplinary processes and demand cooperative work. This entails a series of open problems in supporting this work, ranging from data and process management to appropriate user interfaces. This work contributes solutions to some of these problems. It focuses on improving the mechanisms for documenting processes and on making it possible to publish and integrate them on the Web. This eases the specification and execution of distributed processes on the Web, as well as the reuse of these specifications. The work is based on Semantic Web standards, aiming at interoperability, and on the use of scientific workflows for modeling processes and using them on the Web. The main contributions of this work are: (i) a data model, which takes Semantic Web standards into consideration, for representing scientific workflows and storing them in a database. The model induces a workflow specification method that favors reuse and integration of these specifications; (ii) a comparative analysis of standard proposals for representing workflows in XML; (iii) the proposal of a Web-centered architecture for the management of documents (mainly workflows); and (iv) the partial implementation of this architecture. The work uses the area of environmental planning as a motivation, as a means to elicit requirements and validate the proposal.
The Fluid Web and Digital Content Components: from a document-centered to a content-centered view
The Web is evolving from a space for publication/consumption of documents into an environment for collaborative work, where digital content can travel and be replicated, adapted, decomposed, fused and transformed. We call this the Fluid Web perspective. This view requires a thorough revision of the typical document-oriented approach that permeates content management on the Web. This thesis presents our solution for the Fluid Web, which allows moving from a document-oriented to a content-oriented perspective, where "content" can be any digital object. The solution is based on two axes: a self-descriptive unit that encapsulates any kind of content artifact - the Digital Content Component (DCC); and a Fluid Web infrastructure that provides management and deployment of DCCs through the Web, and whose goal is to support collaboration on the Web.
Statistical significance tests and evaluation of a content-based image retrieval model
http://libdigi.unicamp.br/document/?did=11484
A Data Model for Moving Objects
The dissemination of devices like GPS receivers and wireless networks has enabled new applications that collect and analyze data about moving objects. Traditional database systems do not support the management of moving object data, since a great amount of information is continuously generated. Research in this area is recent, with relatively few works on moving object data management. This MSc thesis proposes an object-oriented moving object data model with two main characteristics: it supports the modeling of static, spatial, temporal, spatio-temporal and moving objects in a homogeneous way; and it specifies a set of basic operators together with their algorithmic specification. These operators can be composed to obtain a wide variety of complex operators to query a moving object database. Unlike most other proposals, this model supports not only 1D objects, but also those with 2D and 3D geometric descriptions. http://libdigi.unicamp.br/document/?did=13826
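The idea of composing complex query operators from basic ones can be sketched as follows. The names and structure are illustrative, not the thesis specification: a moving point is stored as a piecewise-linear trajectory, a basic `at` operator interpolates its position, and a composed operator builds on it.

```python
# Sketch: a moving point with a basic temporal projection operator (`at`),
# and a composed operator (distance between two moving points at an instant).

import math

class MovingPoint:
    def __init__(self, samples):
        # samples: time-ordered list of (t, x, y) observations
        self.samples = samples

    def at(self, t):
        """Basic operator: linearly interpolated position at time t."""
        s = self.samples
        if not s[0][0] <= t <= s[-1][0]:
            raise ValueError("instant outside trajectory lifespan")
        for (t0, x0, y0), (t1, x1, y1) in zip(s, s[1:]):
            if t0 <= t <= t1:
                r = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
                return (x0 + r * (x1 - x0), y0 + r * (y1 - y0))

def distance_at(p, q, t):
    """Composed operator: Euclidean distance between two moving points at t."""
    (px, py), (qx, qy) = p.at(t), q.at(t)
    return math.hypot(px - qx, py - qy)

bus = MovingPoint([(0, 0.0, 0.0), (10, 10.0, 0.0)])
taxi = MovingPoint([(0, 0.0, 3.0), (10, 10.0, 3.0)])
assert bus.at(5) == (5.0, 0.0)
assert distance_at(bus, taxi, 5) == 3.0
```

The thesis model covers 2D and 3D geometries and a much richer operator algebra; the sketch only shows how composition works for the simplest case of moving points.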
Corporate data integration: proposal of an architecture based on data services
The need for data integration in enterprises dates back several decades. However, it is still a pressing problem in most environments, since it is seen as a means to allow integration among customers, partners and suppliers. Besides the needs that arise from company mergers, there is always the issue of legacy systems that result from distinct implementations in different technologies. The resulting scenario is a distributed set of files and databases, which are redundant, heterogeneous and hard to manage. Data integration requires reliable mechanisms, as well as an integrated set of procedures to ensure consistency, security and control of corporate data. Off-the-shelf solutions still provide fragmented views of data integration. This work analyzes problems found in enterprises during data integration processes, taking all the previously mentioned factors into consideration, and proposes an architecture to solve these problems. The solution combines research in databases, distributed systems, and Web services and systems.
Mechanisms to Speed up Foreign Trade Processes
The dynamism of foreign trade, a consequence of the globalization of the economy, has resulted in considerable growth in the goods exchanged all over the world. In order to cope with this demand, the Brazilian government has been continuously investing in the modernization of installations, equipment and software to offer efficient, safe and faster services to the people and enterprises involved in foreign trade. The Integrated System of Foreign Trade - SISCOMEX - was developed as part of this effort. This system, developed by the agencies that control foreign trade, is responsible for controlling and storing all information regarding imports and exports, as well as special customs trade. The system facilitates the supervisory tasks of the IRS, thus speeding up the importation and exportation of goods in ports and airports. The system's DBMS is hosted on a mainframe platform at SERPRO. However, data communication is done via MDB files exchanged with companies connected to the system through links. Every company interested in foreign trade needs to interact with SISCOMEX. This interaction is complicated because companies and government do not always have compatible databases. The dissertation covers the problems of integration and communication between companies and SISCOMEX. The main contributions are: the analysis of the interaction among the participating information systems; and a proposal to standardize the most common foreign trade processes, according to processing requirements and interface standards. The model is validated through a real-life case study and a prototype.
GeoMarketing: Models and Systems, with Applications in Telecommunications
The goal of geomarketing is to manage and combine spatial and business data for decision support within the domain of marketing applications. This concept is evolving to include other application domains. However, the models proposed are seldom properly implemented; moreover, existing systems are not extensible and support only one specific simulation model. This thesis contributes to solving these issues. The main contributions are the following: (1) the proposal of a conceptual architecture for a geomarketing information system that takes new methods and technologies into consideration; (2) a comparative study of distinct types of spatial marketing models and techniques; (3) the identification of some kinds of problems in the telephone/telecommunications domain that can profit from the use of such techniques; (4) the validation of the architecture by means of a Web geomarketing prototype, VoroMarketing. This is a modular and extensible prototype, which supports the use of distinct spatial models; (5) the configuration of the prototype to solve specific problems in the telecommunications domain.
An Environment for Management of Images and Spatial Data for Development of Biodiversity Applications
There is a wide range of environmental applications requiring sophisticated management of several kinds of data, including spatial data and images of living beings. However, available information systems offer very limited support for managing such data in an integrated manner. On the one hand, environmental applications based on Geographic Information Systems (GIS) allow spatially correlating geophysical data and information on living species. On the other hand, image information systems used by biologists provide management of photos of landscapes and/or animals, but without any kind of geographical referencing. This thesis provides a solution to combine these query requirements, which takes advantage of current digital library technology to manage networked collections of heterogeneous data in an integrated fashion. The research thus contributes to solving problems of specification and implementation of biodiversity information systems that manage images of species, textual descriptions and spatial data in an integrated way, under the digital library perspective. This solution provides biodiversity researchers with new querying options. The main contributions of this thesis are: (i) a generic architecture, based on digital library components, for managing heterogeneous data collections, to access biodiversity data sources (text, images, and spatial data); (ii) a proposal of new shape descriptors for supporting content-based image retrieval; (iii) a new digital library component, for content-based image search; (iv) the adoption of distinct visual structures for exploring query results in an image database; and (v) partial validation of the architecture, through the implementation of a prototype that uses fish-related data.
Specification of a Public Bid System in eGovernment
The improvement, reorganization and use of technology in the purchasing process of a public company, besides improving its internal control, enables the reduction of operational costs. This allows increasing efficiency and transparency in public bidding systems. This work presents a model of an Electronic Purchasing System for public companies. This system uses the Internet as a means to support negotiation, within legal principles. The work also proposes a model to manage the supply process. This model supports the integration of the proposed solution with G2B (Government to Business) electronic commerce, thereby contributing to the understanding of the purchasing workflow within public companies in Brazil.
Granulometric analysis of cell nuclei texture: Design of computational tools and application in biological models
The chromatin texture of cell nuclei is of special concern for pathologists, because it can reflect metabolic changes, proliferative activity, nutritional state and cell differentiation. This work aims to enhance knowledge about image texture by assessing granulometric features, surveying different residue-extraction methods found in the literature. Three groups of granulometric methods were implemented: A) classic granulometry, which operates with structuring elements; B) granulometry by area closing or volume closing, which builds a component tree; C) granulometry by H-basins closing, using geodesic reconstruction. The methods were compared with each other using two image sets from biological models. Model I: analysis of chromatin changes in cardiomyocytes during the histological development of 89 Wistar rats, between 19 days after conception and 60 days after birth. In order to obtain hematoxylin-stained cytologic preparations, the formalin-fixed samples were KOH-hydrolysed for 18 hours. The number of mitoses and the granulometric parameters had a similar Spearman correlation coefficient with age, around -0.77. The chromatin texture becomes smoother with ageing, reflecting the progressive cell differentiation of cardiomyocytes. Model II: detection of chromatin texture differences between the nuclei of three lung neoplasias and normal tracheobronchial cells. Cytologic brush smears, hematoxylin-eosin stained, collected during the bronchoscopic exams of 117 patients divided into 4 groups, were compared with each other. The granulometry that uses structuring elements classified 68.4% of cases correctly. Among the residue-extraction techniques, the extraction of H-residues by geodesic reconstruction showed more significant results in the first biological model. Classic granulometric features performed better in classifying the image groups in biological model II. Residue extraction by area opening or volume opening proved to be a fast and efficient method.
The granulometric residues can provide useful information about chromatin texture, as demonstrated in the nuclear changes during the development of the myocardium and between human lung cancers.
A Proposal for the Database of the WebMaps Project
The goal of the WebMaps project is the specification and development of a Web information system to support crop planning and monitoring in Brazil. This kind of project involves state-of-the-art research all over the world. One of the problems faced by WebMaps is database design. This work attacks this issue, discussing the project's needs and proposing a basic database that supports the management of users, properties and parcels, as well as other kinds of data, especially satellite images. The main contributions of this work are: the specification of a spatio-temporal database model; the specification of sets of temporal, spatial and spatio-temporal queries; and the implementation of a prototype in PostgreSQL/PostGIS.
A Methodology to Integrate Legacy Systems and Heterogeneous Databases
Applications increasingly need to access different data sources to obtain information. Many of these sources are managed by legacy systems, and need to be integrated or migrated to become more flexible and manageable. This work proposes a methodology to support the integration of these heterogeneous data sources, taking legacy data and the features of each system into account. The methodology considers several factors that help choose the most suitable solution for each case, and provides an algorithm to design the federated system and to process queries using it. The proposed methodology was validated by a case study on the databases and legacy systems of a municipal administration system for the city of Paulinia, SP.
WOODSS - Spatial Decision Support System based on Workflows
Environmental planning nowadays takes advantage of Geographic Information Systems (GIS) to manage geo-spatial data. Nevertheless, GIS do not provide facilities to reuse users' expertise in solving problems. This dissertation provides a solution to this limitation, specifying and implementing a Spatial Decision Support System, WOODSS. The user's interactions with the GIS are intercepted by WOODSS, which documents them as scientific workflows. These workflows can be edited and re-executed directly in the GIS. WOODSS thus allows documenting and repeating planning activities, as well as creating new planning strategies. It was implemented on top of the IDRISI software, and tested in the context of agro-environmental planning activities.
Use of Urban Geographic Data in the Comparison of Spatial Access Methods
This dissertation presents a performance analysis of spatial access methods based on a real-life database. In spite of the large amount of research dealing with the performance comparison of spatial access methods, very little has been done when it comes to considering the properties of specific groups of applications. In part, this is due to the difficulty of obtaining real data sets that represent these applications. The use of real data is necessary, since synthetic data generation may result in data sets with atypical characteristics, leading in turn to conclusions that may not be generally applicable. In this context, the main contributions of this work are: the conversion of a real data set that is representative of geographic applications for public utility services management (telecommunications, electricity, water supply, and the like) to a format in which it may be easily delivered to other researchers; and the performance comparison of a group of spatial access methods of the R-tree family with regard to the indexing of these data.
A Spatio-Temporal Database for Development of Applications in Geographic Information Systems
This dissertation discusses the implementation of an extensible framework which provides support for the development of spatio-temporal database applications. The infrastructure, developed on the O2 object-oriented database system, consists of a kernel set of operators and database classes which meet the minimum requirements for the processing of spatial, temporal and spatio-temporal queries. The main contributions of this work are the specification of the kernel operators and classes and their implementation, validated through a pilot geographic application. Another contribution is the analysis of this implementation, which discusses problems and shortcomings of some models proposed in the literature.
Design and Implementation of a Metadata Database for the Biodiversity Information System of the State of Sao Paulo
This dissertation presents the design and implementation of the metadata database of the information system for the BIOTA/FAPESP program. This is a long-term scientific program that aims at establishing a common basis for cooperation among different researchers on biodiversity and at disseminating their work, in order to support the creation of environmental preservation programs in the State of São Paulo. The metadata database is the system component responsible for the high-level description of the various biodiversity data collected by researchers. This dissertation discusses different aspects of the development of this database, situating it in the context of a biodiversity information system. The main contributions presented are: a) a survey of several proposals of metadata standards for environmental data; b) the proposal of a metadata standard for the biodiversity information system that encompasses other proposals and extends them in order to consider environmental aspects; c) the design of the metadata database; and d) the implementation of a prototype of the information system, with emphasis on its metadata aspects.
Generation and Indexing of Spatio-Temporal Data
The goal of the dissertation is the design, implementation and evaluation of an access structure for spatiotemporal data. The dissertation is a collection of four papers written in English, with an introduction and a conclusion written in Portuguese. The first paper presents a survey of spatial data indices and persistent indices for traditional data. In addition, the paper describes a novel structure, the HR-tree, as well as its algorithms to insert, delete, update and search data. The second paper addresses the development of an algorithm to generate spatiotemporal data, called GSTD (Generate Spatiotemporal Data). The algorithm generates spatiotemporal data following a few statistical distributions according to user-defined parameters that control, for example, the initial spatial location, the dynamicity of updates (in time) and the movements of the spatial data. The third paper presents a comparison of the HR-tree with two other structures. The first one is a 3D spatial structure, based on the R-tree, that treats time as another dimension; in that structure, the initial and end times of the objects have to be known beforehand. The second one is basically a structure that combines two spatial structures, also based on the R-tree: a 2D structure that indexes current objects (i.e., objects whose end time is unknown) and a 3D structure that indexes already closed objects (i.e., objects with initial and end times known). The fourth and last paper describes an application of the HR-tree to another problem domain, namely bitemporal data indexing. The overall conclusion of this work is that the HR-tree has the best performance (compared with the other two structures) for answering spatial queries at a specific point in time and for small time intervals, but it is much bigger than the other two structures. However, nowadays space requirements are not as problematic as response time; hence, we believe the HR-tree is a good access structure for spatiotemporal data.
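The kind of parameter-controlled generation GSTD performs can be sketched as follows. This is a rough illustration in the spirit of such a generator, not the GSTD algorithm itself; the parameter names and the uniform distributions are assumptions made for the example.

```python
# Sketch of a spatiotemporal data generator: timestamped point positions
# whose initial location and per-step movement are controlled by
# user-defined parameters, with a fixed seed for reproducible workloads.

import random

def generate_points(n_objects, n_steps, start_center=(0.5, 0.5),
                    max_shift=0.01, seed=42):
    """Return (object_id, t, x, y) tuples, clamped to the unit square."""
    rng = random.Random(seed)
    cx, cy = start_center
    # Initial spatial locations: scattered around the chosen center.
    positions = [(cx + rng.uniform(-0.1, 0.1), cy + rng.uniform(-0.1, 0.1))
                 for _ in range(n_objects)]
    data = []
    for t in range(n_steps):
        moved = []
        for oid, (x, y) in enumerate(positions):
            # Movement: bounded random shift per timestamp.
            x = min(1.0, max(0.0, x + rng.uniform(-max_shift, max_shift)))
            y = min(1.0, max(0.0, y + rng.uniform(-max_shift, max_shift)))
            moved.append((x, y))
            data.append((oid, t, x, y))
        positions = moved
    return data

data = generate_points(n_objects=3, n_steps=5)
assert len(data) == 15
assert all(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 for _, _, x, y in data)
```

GSTD additionally supports several statistical distributions and finer control of update dynamicity; the sketch only conveys the shape of the generated data.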
Query Processing in the BIOTA Biodiversity Database
SINBIOTASP is the biodiversity information system being developed as part of the BIOTA/FAPESP program. This thesis focuses on implementation issues in the query processing of the SINBIOTASP system. This subject presents many challenges in the formulation and processing of queries, due to the variety and volume of the data and to the wide range of system user profiles. The main contributions of this work are: a survey of the query processing features of many environmental information systems on the Web; a systematization of the query types that are typical of biodiversity applications, considering processing and interface criteria; and the specification of a basic set of spatial operators, as well as general query interfaces involving maps and textual data, in the context of biodiversity environmental information systems. As a final contribution, this analysis was validated by the development of the Species Mapper module of SINBIOTASP, which allows Web query processing on the collection and distribution of species.
Factors that Affect the Performance of Spatial Join Methods: a Study Based on Real Data
This work analyses synchronized tree traversal join methods for spatial access methods. The factors considered include buffer pool size, page size, ordering criteria for intermediate join indices, and buffer pool page replacement policies, among others. The analysis is based on real data taken from a GIS application for telecommunications, indexed with an R*-tree. The results show how these factors affect spatial join performance and can be used for tuning such methods.
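The filter step that such join methods accelerate can be illustrated with a naive sketch (illustrative code, not the evaluated implementation): report every pair of minimum bounding rectangles (MBRs) that intersect. Synchronized tree traversal avoids most of these quadratic comparisons by descending both R*-trees only where parent MBRs overlap.

```python
def mbrs_intersect(a, b):
    # An MBR is a tuple (xmin, ymin, xmax, ymax).
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def spatial_join_filter(r_set, s_set):
    """Naive filter step of a spatial join: return the index pairs of
    every intersecting MBR pair. Tree-based methods prune this O(n*m)
    comparison; buffer and page parameters govern its I/O cost."""
    return [(i, j)
            for i, a in enumerate(r_set)
            for j, b in enumerate(s_set)
            if mbrs_intersect(a, b)]
```

The candidate pairs produced by the filter step are then passed to a refinement step that tests the exact geometries.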
Data Quality in Geographic Applications
One of the main goals of a GIS is to help decision makers carry out their tasks in situations where the spatial dimension is relevant -- e.g., in urban or environmental planning activities. The quality of the decisions, however, depends intimately on the quality of the geographic data used. This is usually ignored by decision makers, who limit themselves to relying on the correct operation of the equipment used to collect the data or on the GIS where the applications are developed. The goal of this dissertation is to fill this gap by presenting an analysis of data quality in the context of geographic applications. This analysis ranges from the data capture stage to the presentation of application results and their interpretation by the user for decision making. Besides an extensive bibliographic survey, the contributions of this work include the suggestion of a basic set of criteria to evaluate this quality and an analysis of how these criteria can be met. Finally, part of these suggestions was implemented in a tool coupled to a GIS, which allows users to visualize data quality information.
Combining Databases and Case Based Reasoning for Decision Support in Environmental Planning
Environmental planning takes advantage of Spatial Decision Support Systems (SDSS) for problem solving. These systems supply integrated frameworks that allow users to deal with data and models in analysis and simulation tasks. However, they usually provide generic models that need to be specialized to fit particular situations. Since this process requires considerable effort and expertise, it is crucial to allow planners to profit from others' experience. The goal of this dissertation is to develop mechanisms that help environmental planners solve problems incrementally. The solution presented here consists of coupling Case-Based Reasoning (CBR) to the WOODSS spatial decision support system (WOrkflOw-based spatial Decision Support System), developed at the LIS laboratory of the Institute of Computing, UNICAMP. WOODSS interacts with a Geographical Information System and provides model handling facilities, documenting models by means of scientific workflows. The focus of this work is on specifying and implementing new model storage and retrieval modules for WOODSS, using CBR techniques. The main contributions of this research are: (a) requirements elicitation for using CBR in environmental decision support; (b) the development of model management algorithms founded on CBR; and (c) the extension of the WOODSS system, making it more suitable for problem solving from precedent cases.
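The retrieval step at the core of CBR can be sketched as a weighted nearest-neighbour match over case features (a hypothetical simplification; the thesis retrieves scientific workflows, not flat feature sets):

```python
def retrieve_case(case_base, query, weights):
    """Nearest-neighbour retrieval, the core step of CBR: score each
    stored case by the summed weights of the features it shares with
    the query, and return the best-matching case for reuse/adaptation."""
    def similarity(case):
        return sum(w for feature, w in weights.items()
                   if case['features'].get(feature) == query.get(feature))
    return max(case_base, key=similarity)

# Hypothetical planning cases and feature weights, for illustration only.
cases = [
    {'name': 'erosion-plan',    'features': {'soil': 'clay', 'slope': 'steep'}},
    {'name': 'irrigation-plan', 'features': {'soil': 'sand', 'slope': 'flat'}},
]
best = retrieve_case(cases, {'soil': 'clay', 'slope': 'flat'},
                     weights={'soil': 0.6, 'slope': 0.4})
```

In a full CBR cycle the retrieved case would then be adapted to the new problem and the adapted solution stored back into the case base.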
Metadata for Scientific Workflows to support Environmental Planning
Environmental planning activities have received great attention in recent years, in response to factors that include accelerating population growth and the consequent need for rational exploitation of natural resources. The problems in this domain are complex and the goals often conflict, demanding the cooperation of many kinds of experts from several application domains. The WOODSS system (WOrkflOw-based Spatial Decision Support System), developed at UNICAMP's Institute of Computing, supports environmental planning activities by documenting them with scientific workflows stored in a database. The focus of this dissertation is the specification of efficient means for managing these workflows. The solution is based on metadata specific to scientific workflows, allowing flexible access to environmental plans through distinct sets of parameters, thereby helping communication among the experts involved, as well as plan maintenance, reuse and evolution. The main contributions of this dissertation are: (1) a survey of requirements for documenting environmental planning activities; (2) the proposal of a metadata standard for workflows that document environmental planning activities; (3) the specification of mechanisms to couple this standard to WOODSS; and (4) a partial implementation of the proposal, geared towards system extensibility.
An Architecture based on Predicate Generation for derivation of Spatial Association Rules
This thesis proposes and develops models and techniques for deriving spatial association rules. The approach is based on a two-step process. In the first step, the geographic database is preprocessed using a knowledge base, specified by an expert user, that indicates the relationships of interest. This produces a file in which data are organized in terms of conventional and spatial predicates. This file can then be processed by standard data mining algorithms, reducing the derivation of spatial rules to the classical problem of applying traditional association rule mining algorithms. The first step uses two proposed models. The first is the Model of Relational Derivation, whose goal is to identify conventional predicates based on the analysis of descriptive attributes. The second is the Model of Spatial Derivation, responsible for checking spatial relationships among objects and generating spatial predicates, to be subsequently used to derive spatial association rules. A subsequent denormalization algorithm combines conventional and spatial predicates into a single file, used to mine association rules. The main contributions of this work are (i) the specification and validation of a model to derive spatial predicates; (ii) the creation of an architecture that allows obtaining spatial association rules using standard relational mining algorithms; (iii) the use of a knowledge base to obtain predicates that are relevant to the user; and (iv) the implementation of a prototype.
The POESIA Approach for the Integration of Data and Services in the Semantic Web
POESIA (Processes for Open-Ended Systems for Information Analysis), the approach proposed in this work, supports the construction of complex processes that involve the integration and analysis of data from several sources, particularly in scientific applications. The approach is centered on two kinds of semantic Web mechanisms: scientific workflows, to specify and compose Web services; and domain ontologies, to enable semantic interoperability and the management of data and processes. The main contributions of this thesis are: (i) a theoretical framework to describe, discover and compose data and services on the Web, including rules to check the semantic consistency of resource compositions; (ii) ontology-based methods to help data integration and estimate data provenance in cooperative processes on the Web; and (iii) the partial implementation and validation of the proposal in a real application in the domain of agricultural planning, analyzing the benefits and scalability problems of current semantic Web technology when faced with large volumes of data.
Access Control in Geographic Databases
The access control problem in databases consists of determining whether (and when) users or applications (WHO) can access stored data (WHAT), and what kind of access (HOW) they are allowed. Most research in this area is geared towards the management of relational data for commercial applications. The objective of this thesis is to study this problem for geographic databases, where constraints imposed on access control management must consider the spatial location context. The main contributions of this work are: (a) an overview of requirements analysis for access control in geographic databases; (b) the definition of an authorization model based on spatial characterization; (c) a discussion of the implementation aspects of this model; and (d) an analysis of how this proposal can be adopted by a large-scale telecommunications AM/FM spatial application, the SAGRE system. SAGRE is an outside plant management geographic information system, developed at the CPqD Foundation and in use at most telephone service providers in Brazil.
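A minimal sketch of a spatially constrained authorization check (names are illustrative and unrelated to SAGRE's actual model): an authorization couples WHO, HOW and WHAT with a spatial window, and a request is granted only if the requested object's location falls inside a granted window.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Authorization:
    """WHO may perform ACTION (HOW) on OBJECT_TYPE (WHAT), restricted
    to a spatial window (xmin, ymin, xmax, ymax)."""
    subject: str
    action: str
    object_type: str
    window: tuple

def point_in_window(p, w):
    return w[0] <= p[0] <= w[2] and w[1] <= p[1] <= w[3]

def is_allowed(auths, subject, action, object_type, location):
    """Grant access only if some authorization matches the request AND
    the object's location lies inside that authorization's window."""
    return any(a.subject == subject and a.action == action
               and a.object_type == object_type
               and point_in_window(location, a.window)
               for a in auths)

auths = [Authorization('alice', 'read', 'cable', (0, 0, 10, 10))]
```

The spatial window is what distinguishes this model from classical relational access control: the same subject/action/object triple may be allowed in one region and denied in another.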
3D GIS Application Interfaces considered as Communication Spaces
A Geographical Information System (GIS) is a system for the manipulation, management and visualization of geo-referenced data. The term geo-referenced denotes data that have a representation in a system of geographic coordinates. A GIS allows the creation of applications for specific domains, such as urban and environmental planning. An application involves data, algorithms, functions and visualization (the application interface). There are two GIS interface categories, 2D GIS and 3D GIS; in this work, we are particularly interested in the latter. 2D GIS are restricted to a 2D representation of space, whereas 3D GIS allow the creation of application interfaces that raise geographic visualization to a higher level of visual reality. Visual reality, in this context, refers to the view that a human being has of the real world. Despite their facilities for manipulating geographic data, GIS presuppose that application designers have specific knowledge of all aspects of the system's technology, thus restricting their use to people involved in that domain. A series of conceptual problems creates a gap between a GIS and the reality perceived by application designers, starting with the design of the interface of these tools. This hampers the development of 3D interfaces for GIS applications. The objective of this work is the study and evaluation of the modelling of 3D interfaces for GIS applications, illustrated by a case study on ArcView GIS 3D Analyst. To deal with the problem, we propose the use of a specific semiotics-based methodology, called Communication Space, for modelling 3D GIS application interfaces. Semiotics allows application interface entities to be treated as elements that communicate a meaning, enabling the designer to capture inconsistencies that are important in the (re)design of the 3D interface.
The adopted methodology served as a basis to develop an interface layer on top of ArcView GIS 3D Analyst, called EComSIG. The objective of EComSIG is to hide the inherent complexity of modelling 3D interfaces for GIS applications and, at the same time, to systematize the process of designing 3D interfaces for such applications. The contributions of the work are of two kinds: (i) theoretical, a study of interface aspects of Geographical Information Systems from a semiotic perspective; and (ii) applied, the development of a prototype to evaluate the relevance of the proposed solution.
Use and Application of Economic Models in Geomarketing Information Systems
Survival in the business world depends on knowledge of one's clients and competitors. A crucial factor in this competition is the ability to manage business data within a geographic context. The search for efficiency in decision making motivated the emergence of geomarketing, which combines marketing policies and strategies with information systems and the geographic location of the resources manipulated. This work aims to fill a gap in this nascent area by combining results from economic models, computer science and geoprocessing. The main contributions of this dissertation are: a) a survey of the theoretical basis underlying information systems applied to geomarketing, covering both economic modeling and computer science aspects; b) an analysis of software engineering methodologies applied to the development of geomarketing applications; and c) the implementation of a real-life case study in geomarketing, adapting a specific economic model and coupling it to a geographic information system.
Database-centered Documentation of Environmental Planning Activities
The environmental planning process is a complex task that covers many aspects, involves a series of steps and is fed by many data sources. Normally, this process demands the cooperation of multidisciplinary teams that discuss several planning alternatives, considering, for instance, multiple issues in the preservation or recovery of environmental resources. One of the main problems in this process is the lack of associated documentation. As in any cooperative activity, documentation is important for revision, maintenance and evolution of the plan, and for communication among designers. The goal of this dissertation is to partially solve the documentation problem through the specification and partial implementation of an environment to manage, in a unified way, three kinds of documents generated during environmental planning activities: descriptions of the final product, the plan (WHAT documents); descriptions of the process used to obtain the final product (HOW documents); and descriptions of the reasons behind planning decisions (WHY documents). These documents were specified so that they can be stored and managed in a database: WHAT documents are represented through hypermedia structures, HOW documents through scientific workflows, and WHY documents through design rationale structures. The main contributions of this research are: (a) the database-centered specification and design of the WHY, HOW and WHAT documents; (b) the specification of an environment to support the management of these documents, thus fostering cooperative work in environmental planning; and (c) the partial implementation of this environment.
An architecture for querying biodiversity repositories on the Web
Life on Earth forms a broad and complex network of interactions, which some experts estimate to comprise up to 80 million different species. Tackling biodiversity is essentially a distributed effort: a research institution, no matter how big, can only deal with a small fraction of this variety. Therefore, to carry out ecologically relevant biodiversity research, one must collect chunks of information on species and their habitats from a large number of institutions and correlate them using geographic, biologic and ecological knowledge. The distribution and heterogeneity inherent to biodiversity data pose several challenges, such as how to find relevant information on the Web, how to solve syntactic and semantic heterogeneity, and how to process a variety of ecological and spatial predicates. This dissertation presents an architecture that exploits advances in data interoperability and semantic Web technologies to meet these challenges. The solution relies on ontologies and annotated repositories to support data sharing, discovery and collaborative biodiversity research. A prototype using real data implements part of the architecture.
Dynamic Constraints in Active Object-Oriented Databases
This dissertation addresses the problem of modeling and enforcing general integrity constraints in database systems. The solution is based on active object-oriented database management systems (DBMSs) that support rule mechanisms. The work proposes a strategy to be applied during application design that takes into consideration the behavioral and active features of the DBMS. The strategy's goal is to represent the constraints in the conceptual design using CDL, a declarative and model-independent language, and to provide mappings into production rules responsible for constraint enforcement. The main contributions presented are: a taxonomy of integrity constraints in information systems modeling; the specification of the CDL constraint language; general heuristics for mapping constraints expressed in CDL into production rules in the active database; and the specification of the features an active database must provide in order to support general integrity constraints in information systems. This dissertation extends previous proposals found in the literature by supporting the modeling of dynamic constraints in database system design using active object-oriented DBMSs.
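The mapping from a declarative constraint to a production (event-condition-action) rule can be sketched with a toy engine (illustrative code, not CDL's actual mapping): a dynamic constraint such as "a salary may never decrease" compares the new value against the stored state and rejects violating transitions.

```python
class ActiveDB:
    """Tiny event-condition-action engine: rules fire on update events,
    and the action (here, rejecting the change) enforces the constraint."""
    def __init__(self):
        self.data = {}
        self.rules = []  # list of (event, condition, action) triples

    def on(self, event, condition, action):
        self.rules.append((event, condition, action))

    def update(self, key, value):
        # Check every rule before applying the update; a matching
        # condition triggers the action instead of the write.
        for event, condition, action in self.rules:
            if event == 'update' and condition(key, value):
                return action(key, value)
        self.data[key] = value
        return True

db = ActiveDB()
# Dynamic constraint: a salary may never decrease between states.
db.on('update',
      condition=lambda k, v: k in db.data and k.startswith('salary:')
                             and v < db.data[k],
      action=lambda k, v: False)  # reject the violating transition
```

Dynamic constraints are exactly those that, like this one, relate two successive database states, which is why an active rule mechanism (rather than a static check) is needed to enforce them.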
A Graphical Tool for Browse and Query in Object-Oriented Databases
This dissertation analyses the problems involved in the design and implementation of graphical interfaces for object-oriented database management systems (DBMSs). As a result of this analysis, it presents directives for the development of database system interfaces. The practical application of these directives is illustrated through the specification and implementation of GOODIES, a new interface system that allows browsing and querying DBMSs supporting the basic features of the OO model. The design and implementation of this system are described as a case study of the use of the proposed directives. The development process was purposely conducted independently of any specific DBMS; thus, the system can be used on top of several OO database systems.
 
