
Publications by José Luís Borges


Semantically Enriched Variable Length Markov Chain Model for Analysis of User Web Navigation Sessions

Shirgave, S; Kulkarni, P; Borges, J;


The rapid growth of the World Wide Web has resulted in intricate Web sites, demanding enhanced user skills to find the required information and more sophisticated tools that are able to generate apt recommendations. Markov Chains have been widely used to generate next-page recommendations; however, the accuracy of such models is limited. Herein, we propose the novel Semantic Variable Length Markov Chain Model (SVLMC), which combines the fields of Web Usage Mining and Semantic Web by enriching the Markov transition probability matrix with rich semantic information extracted from Web pages. We show that the method is able to enhance the prediction accuracy relative to usage-based higher order Markov models and to semantic higher order Markov models based on an ontology of concepts. In addition, the proposed model is able to handle the problem of ambiguous predictions. An extensive experimental evaluation was conducted on two real-world data sets and on one partially generated data set. The results show that the proposed model is able to achieve 15-20% better accuracy than the usage-based Markov model, 8-15% better than the semantic ontology Markov model and 7-12% better than the semantic-pruned Selective Markov Model. In summary, the SVLMC is the first work proposing the integration of a rich set of detailed semantic information into higher order Web usage Markov models, and experimental results reveal that the inclusion of detailed semantic data enhances the prediction ability of Markov models.
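
To make the usage-based baseline mentioned in the abstract concrete, the following is a minimal sketch of a first-order Markov next-page recommender. It is not the SVLMC model and includes no semantic enrichment or variable-length history; the function names (`fit_transitions`, `recommend`) and the sample sessions are hypothetical and invented purely for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count page-to-page transitions and normalise them to probabilities."""
    counts = defaultdict(Counter)
    for session in sessions:
        # Pair every page with the page visited immediately after it.
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return {
        page: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for page, followers in counts.items()
    }


def recommend(transitions, page, k=1):
    """Return up to k most probable next pages after `page`."""
    followers = transitions.get(page, {})
    # Sort by descending probability; break ties alphabetically.
    ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))
    return [nxt for nxt, _ in ranked[:k]]


# Hypothetical navigation sessions (assumption, not data from the paper).
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "about"],
    ["products", "cart"],
]
transitions = fit_transitions(sessions)
print(recommend(transitions, "home"))
print(recommend(transitions, "products", k=2))
```

A first-order model conditions only on the current page, which is one reason higher-order and variable-length models such as those discussed in the abstract are of interest.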


Risk-Taking Propensity and Entrepreneurship: The Role of Power Distance

Antoncic, JA; Antoncic, B; Gantar, M; Hisrich, RD; Marks, LJ; Bachkirov, AA; Li, ZY; Polzin, P; Borges, JL; Coelho, A; Kakkonen, ML;


The personal characteristics of entrepreneurs can be importantly related to entrepreneurial startup intentions and behaviors. A country-moderated hypothesis including the relationship between an individual's risk-taking propensity and entrepreneurship (behaviors or intentions of the person) was conceptually developed and empirically tested in this study. The data collection was performed through a structured questionnaire. Multinomial logistic regression was used for analyzing data obtained from 1,414 students in six countries. The crucial contribution of this research is the clarification of the character of risk-taking propensity in entrepreneurship and the indication that the risk-taking propensity-entrepreneurship relationship can be moderated contingent on power distance.


Using academic success to predict dropout among higher education students: a case study

Sousa, ACCd; Oliveira, CABd; Borges, JLCM;

Educação e Pesquisa

Abstract: School dropout is a complex problem that affects most post-secondary undergraduate programs worldwide. The industrial engineering program at Instituto ISVOUGA, located in Santa Maria da Feira, Portugal, is no exception. This study used a data set containing general information about the students and their grades in the course units already assessed. From this data set, seventeen potential predictors were selected: five intrinsic (gender, marital status, employment status, age, and study regime, full-time or part-time) and twelve extrinsic (the grades in all twelve course units taught during the first two semesters of the program). The main objective of this research was to predict the probability of a student dropping out of the program based on these predictors. A binary logistic regression was used to classify students as having a high or low probability of not re-enrolling in the program. To validate whether the methodology is appropriate for this study, the accuracy obtained with the logistic regression model was compared, through five-fold cross-validation, with the accuracy obtained using three methods widely used in data mining: One R, K Nearest Neighbors, and Naive Bayes. The logistic regression model identified four significant variables in predicting dropout (the grades in the course units materials science, electricity, calculus 1, and chemistry). The two most influential predictors of student dropout are failing the less demanding course units: materials science and electricity. Contrary to what might have been expected before this research, we found that failing more demanding course units, such as physics or statistics, has no significant influence on dropout.



Real, AC; Borges, J; Oliveira, CB;


Air temperature data from many locations worldwide are only available as series of daily minimum and maximum temperatures. Historically, several different approaches have been used to estimate the actual daily mean temperature, since only in the last two or three decades have automatic thermometers been able to compute its actual value. The most common approach is to estimate it by averaging the daily minimum and maximum. When only daily minima and maxima are available, an alternative approach, proposed by Dall'Amico and Hornsteiner in 2006, uses the two daily extremes together with the next day's minimum temperature and a coefficient related to the local daily astronomical sunset time. Additionally, the method uses two optimizable coefficients related to the region's temperature profile. To use this approach, the region's unknown parameters must first be optimized, which requires a dataset containing the maximum, minimum, and actual daily mean temperatures for at least one year. In this research, for the period 2007-2014, we used three datasets of minimum, maximum and actual mean temperatures obtained at three automatic meteorological stations located in the Douro Valley to optimize the two unknown parameters in the Dall'Amico and Hornsteiner approach. Moreover, we compared the actual mean daily temperatures available from the three datasets with the corresponding values estimated by using i) the usual approach of averaging the daily maximum and minimum temperatures and ii) the Dall'Amico and Hornsteiner approach. Results show that the former approach overestimates, on average, the daily mean temperatures by 0.5 °C. The Dall'Amico and Hornsteiner approach proved to be a better approximation of mean temperatures for the three meteorological stations used in this research, being unbiased relative to the actual mean values of daily temperatures.
In conclusion, this research confirms that the Dall'Amico and Hornsteiner approach is a better way to estimate the mean daily temperatures and provides the optimized parameters for three sites, one in each of the three sub-regions of the Douro Valley (Baixo Corgo, Cima Corgo and Douro Superior).
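
As a small illustration of the comparison described in this abstract, the sketch below computes the usual min-max average estimate of the daily mean and its average bias against measured daily means. It does not implement the Dall'Amico and Hornsteiner method, whose formula is not given here; the function names and the temperature records are hypothetical values invented for the example, not data from the study.

```python
from statistics import mean


def minmax_mean(t_min, t_max):
    """Estimate the daily mean as the average of the daily extremes."""
    return (t_min + t_max) / 2.0


def mean_bias(records):
    """Average of (estimated mean - actual mean) over all days.

    Each record is a tuple (t_min, t_max, t_mean_actual) in degrees Celsius.
    A positive result means the estimate overestimates on average.
    """
    return mean(minmax_mean(lo, hi) - actual for lo, hi, actual in records)


# Hypothetical daily records (assumption, not data from the study).
records = [
    (8.0, 20.0, 13.5),
    (10.0, 24.0, 17.0),
    (5.0, 15.0, 9.75),
]
bias = mean_bias(records)
print(f"Average bias of the min-max estimate: {bias:+.2f} °C")
```

The same bias calculation could be applied to any alternative estimator by swapping out `minmax_mean`, which is how two estimation approaches can be compared on equal terms.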


Visualization of Urban Mobility Data from Intelligent Transportation Systems

Sobral, T; Galvao, T; Borges, J;


Intelligent Transportation Systems are an important enabler for the smart cities paradigm. Currently, such systems generate massive amounts of granular data that can be analyzed to better understand people's dynamics. To address the multivariate nature of spatiotemporal urban mobility data, researchers and practitioners have developed an extensive body of research and interactive visualization tools. Data visualization provides multiple perspectives on data and supports the analytical tasks of domain experts. This article surveys related studies to analyze which topics of urban mobility were addressed and their related phenomena, and to identify the adopted visualization techniques and sensor data types. We highlight research opportunities based on our findings.


Prediction of Journey Destination for Travelers of Urban Public Transport: A Comparison Model Study

Costa, V; Fontes, T; Borges, JL; Dias, TG;

Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST

In public transport, smart card-based ticketing systems make it possible to redesign the urban public transport (UPT) network by providing customized transport services or by incentivizing travelers to change specific patterns. However, in open systems, developing personalized connections requires knowing the journey destination before the end of the trip. To obtain that knowledge, this study applied three models (Top-K, NB, and J48) to different groups of travelers of an urban public transport network located in a medium-sized European metropolitan area (Porto, Portugal). Typical travelers were selected from the segmentation of transportation card signatures, and groups were defined based on traveler age or economic conditions. The results show that it is possible to predict the journey's destination based on past journeys with an accuracy that varies, on average, from 20% in the worst scenarios to 65% in the best. © 2019, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
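
The abstract does not define its Top-K model, so the following is only one plausible reading, offered as a rough sketch: predict that a traveler's next destination is among the K destinations they visited most often in the past, and measure how often that holds. The function names, the stop labels, and the journey lists are hypothetical and not taken from the paper.

```python
from collections import Counter


def top_k_destinations(history, k=1):
    """Return the k most frequent past destinations (ties broken by name)."""
    counts = Counter(history)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [stop for stop, _ in ranked[:k]]


def hit_rate(history, future, k=1):
    """Share of future journeys whose destination is in the top-k of history."""
    predicted = set(top_k_destinations(history, k))
    return sum(dest in predicted for dest in future) / len(future)


# Hypothetical destinations for one traveler (assumption, not real data).
history = ["A", "A", "B", "A", "C"]
future = ["A", "C", "A", "B"]

print(top_k_destinations(history, k=2))
print(hit_rate(history, future, k=1))
print(hit_rate(history, future, k=2))
```

Raising K trades precision of the prediction for a higher hit rate, which is one reason a frequency-based baseline of this kind is usually compared against classifiers such as the NB and J48 models named in the abstract.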
