As some of you know well, and others of you may be interested to learn, a number of languages (including Chinese and Japanese) are written without spaces between the words. Consequently, software that works with text written in these languages must address the word segmentation problem: inferring likely boundaries between consecutive words in the text. If English were written without spaces, the analogous problem would consist of taking a string like "meetateight" and deciding that the best segmentation is "meet at eight" (and not "me et at eight," or "meet ate ight," or any of a huge number of even less plausible alternatives). How could we automate this process?

A simple approach that is at least reasonably effective is to find a segmentation that maximizes the cumulative "quality" of its individual constituent words. Thus, suppose you are given a black box that, for any string of letters x = x_1 x_2 ... x_k, will return a number quality(x). This number can be either positive or negative; larger numbers correspond to more plausible English words. (So quality("me") would be positive, while quality("ght") would be negative.)

Given a long string of letters y = y_1 y_2 ... y_n, a segmentation of y is a partition of its letters into contiguous blocks of letters; each block corresponds to a word in the segmentation. The total quality of a segmentation is determined by adding up the qualities of each of its blocks. (So we'd get the right answer above provided that quality("meet") + quality("at") + quality("eight") was greater than the total quality of any other segmentation of the string.)

Give an efficient algorithm that takes a string y and computes a segmentation of maximum total quality. (You can treat a single call to the black box computing quality(x) as a single computational step.)
(A final note, not necessary for solving the problem: to achieve better performance, word segmentation software in practice works with a more complex formulation of the problem, for example incorporating the notion that solutions should not only be reasonable at the word level, but also form coherent phrases and sentences. If we consider the example "theyouthevent," there are at least three valid ways to segment this into common English words, but one constitutes a much more coherent phrase than the other two. In the terminology of formal languages, this broader problem is like searching for a segmentation that can also be parsed well according to a grammar for the underlying language. But even with these additional criteria and constraints, dynamic programming approaches lie at the heart of a number of successful segmentation systems.)
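
One possible approach to the problem above, offered as a sketch rather than as the intended answer, is dynamic programming over prefixes. Let OPT(j) be the best total quality of any segmentation of the first j letters of y, with OPT(0) = 0. The last word of such a segmentation starts at some position i < j, so OPT(j) = max over 0 <= i < j of OPT(i) + quality(y_{i+1} ... y_j). Filling in OPT(1), ..., OPT(n) takes O(n^2) calls to the black box. In the Python below, the names best_segmentation, TOY_SCORES and toy_quality are hypothetical, and the scores are made up purely for illustration; they stand in for the black box and do not come from the problem statement.

```python
from typing import Callable, List, Tuple


def best_segmentation(
    y: str, quality: Callable[[str], float]
) -> Tuple[float, List[str]]:
    """Return (best total quality, list of words) for the string y."""
    n = len(y)
    # opt[j] holds the best total quality for the prefix y[:j].
    opt = [0.0] + [float("-inf")] * n
    # back[j] holds the start index of the last word in that best segmentation.
    back = [0] * (n + 1)

    for j in range(1, n + 1):
        for i in range(j):
            # Candidate: best segmentation of y[:i] followed by the word y[i:j].
            score = opt[i] + quality(y[i:j])
            if score > opt[j]:
                opt[j], back[j] = score, i

    # Walk the back-pointers from the end to recover the words.
    words: List[str] = []
    j = n
    while j > 0:
        i = back[j]
        words.append(y[i:j])
        j = i
    words.reverse()
    return opt[n], words


# Hypothetical stand-in for the black box: made-up scores for a few words,
# and a penalty of -len(x) for any string that is not in the table.
TOY_SCORES = {"meet": 4, "at": 2, "eight": 5, "me": 2, "ate": 3}


def toy_quality(x: str) -> float:
    return TOY_SCORES.get(x, -len(x))
```

With these toy scores, best_segmentation("meetateight", toy_quality) returns a total of 11.0 and the words ["meet", "at", "eight"]. The back-pointer array is one design choice among several; storing the best word list at each prefix would also work but copies more data.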

