Arabic to French Sentence Alignment: Exploration of A Cross
By a mysterious writer
Last updated 10 November 2024
Sentence alignment consists of estimating which sentence or sentences in the source language correspond to which sentence or sentences in the target language. This paper presents a new approach to aligning sentences from a parallel corpus, based on a cross-language information retrieval system. The approach builds a database from the sentences of the target text and treats each sentence of the source text as a "query" to that database.

The cross-language information retrieval system is a weighted Boolean search engine based on a deep linguistic analysis of the query and of the documents to be indexed. It is composed of a multilingual linguistic analyzer, a statistical analyzer, a reformulator, a comparator and a search engine. The multilingual linguistic analyzer includes a morphological analyzer, a part-of-speech tagger and a syntactic analyzer. It processes both the documents to be indexed and the queries to produce a set of normalized lemmas, a set of named entities and a set of nominal compounds with their morpho-syntactic tags. The statistical analyzer computes concept weights for the documents to be indexed, based on the frequencies of concepts in the database. The comparator computes intersections between queries and documents and provides a relevance weight for each intersection. Before this comparison, the reformulator expands the query, inferring from its original words other words that express the same concepts.
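The expansion, weighting and comparison steps described above can be illustrated with a minimal sketch. It is not the system from the paper: the toy lexicon `LEXICON` stands in for the reformulator, whitespace tokenization stands in for the deep linguistic analysis, and an IDF-style formula stands in for the concept weights. All function names and the example sentences are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon standing in for the reformulator
# (Arabic word -> French words expressing the same concept).
LEXICON = {
    "الحرب": {"guerre"},
    "السلام": {"paix"},
    "الاقتصاد": {"économie"},
}

def tokenize(sentence):
    # Stand-in for the linguistic analyzer: lowercase whitespace tokens.
    return sentence.lower().split()

def concept_weights(target_sentences):
    # Assumed IDF-style weighting: tokens that occur in fewer target
    # sentences receive a higher weight.
    n = len(target_sentences)
    doc_freq = Counter(tok for s in target_sentences for tok in set(tokenize(s)))
    return {tok: math.log(1 + n / freq) for tok, freq in doc_freq.items()}

def expand(source_sentence):
    # Query expansion: map each source token to its target-language words.
    expanded = set()
    for tok in tokenize(source_sentence):
        expanded |= LEXICON.get(tok, set())
    return expanded

def rank(source_sentence, target_sentences):
    # Score each target sentence by the summed weights of the tokens it
    # shares with the expanded query, then sort best-first.
    weights = concept_weights(target_sentences)
    query = expand(source_sentence)
    scored = []
    for idx, target in enumerate(target_sentences):
        shared = query & set(tokenize(target))
        scored.append((sum(weights[tok] for tok in shared), idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored

# The "database" of target sentences, queried with one source sentence.
targets = [
    "la guerre et la paix",
    "une économie en crise",
    "la paix durable",
]
best_score, best_index = rank("الحرب و السلام", targets)[0]
```

Here the source sentence is aligned to the target sentence at `best_index`. A real aligner would also need a score threshold, so that source sentences with no counterpart in the target text are left unaligned.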
The search engine retrieves the ranked, relevant documents from the indexes according to the reformulated query, then merges the results obtained for each language, taking into account the original words of the query and their weights in order to score the documents.

The sentence aligner was evaluated on the MD corpus of the ARCADE II project, which is composed of news articles from the French newspaper "Le Monde Diplomatique". The part of the corpus used in the evaluation consists of the same subset of sentences in Arabic and French, with Arabic sentences aligned to their French counterparts. The results show that the alignment achieves correct precision and recall even when the corpus is not completely parallel (changes in sentence order or missing sentences).
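As a reminder of how precision and recall are typically computed for an aligner, here is a minimal sketch that assumes alignments are represented as sets of (source index, target index) links. The exact scoring granularity used in ARCADE II may differ, and the links below are made-up illustrations, not results from the paper.

```python
def precision_recall(predicted, reference):
    # Precision: share of predicted links that are in the reference.
    # Recall: share of reference links that were predicted.
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical links (source_index, target_index); the reference contains
# a change in sentence order, and the prediction misses one sentence.
reference = {(0, 0), (1, 2), (2, 1), (3, 3)}
predicted = {(0, 0), (1, 2), (2, 3)}
p, r = precision_recall(predicted, reference)
```

In this made-up case two of the three predicted links are correct and two of the four reference links are found, so precision is 2/3 and recall is 1/2.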