Poster:
|
andrewbontrager |
Date:
|
Jun 7, 2012 11:01am |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
Precisely. I finally got the text to where it doesn't run off the screen like most of my items, but now the text doesn't show at all from the full text link. Hope the administrators can fix it, glad it's nothing on my computer. All my new additions to
archive.org/details/uspatentssingledocuments
are coming up with the glitch too. Thanks guys.
Poster:
|
stbalbach |
Date:
|
Jun 7, 2012 1:15pm |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
Andrew, if all the uploads have the same problem, it may be something in the text itself: perhaps embedded codes, or unix vs dos line breaks, something like that. The streaming mode (Full Text) could be trying to display the file as a single long line without line breaks, exceeding a buffer, and so displaying nothing. Do you have access to a unix box where you could run it through unix2dos (or dos2unix) to convert the line breaks? These are just guesses; an admin will probably have to take a look.
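For anyone without a unix box handy, the line-break conversion can be roughly approximated in a few lines of Python. This is only a sketch of the idea, not a reimplementation of unix2dos or dos2unix; the function name `normalize_newlines` and the sample bytes are made up for illustration.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Convert CRLF (DOS) and bare CR (classic Mac) line endings to `newline`."""
    # Handle CRLF first so a CR-LF pair does not become two line breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    if newline == b"\n":
        return unified
    return unified.replace(b"\n", newline)


# Hypothetical sample mixing all three line-ending styles.
sample = b"line one\r\nline two\rline three\n"
print(normalize_newlines(sample))           # Unix-style LF endings
print(normalize_newlines(sample, b"\r\n"))  # DOS-style CRLF endings
```

To use it on a real file, read the file in binary mode (`open(path, "rb")`), pass the bytes through the function, and write the result back in binary mode so Python does not translate the line endings again.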
Poster:
|
andrewbontrager |
Date:
|
Jun 8, 2012 7:40am |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
As far as I know, there aren't any embedded codes; however, I edit these documents on a nonstandard PDA, which is why the files come out in such long lines.
Poster:
|
andrewbontrager |
Date:
|
Jun 8, 2012 12:49pm |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
Thanks for the tip, I tried sending a txt file to the site and came up with a 500 server error, shoot.
Poster:
|
stbalbach |
Date:
|
Jun 8, 2012 2:30pm |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
Ok sorry site appears broken. How about converting to another format using this site:
http://document.online-convert.com/
Try any number of formats, such as text, html, pdf, doc -- then upload and see how Internet Archive displays.
Poster:
|
andrewbontrager |
Date:
|
Jun 8, 2012 8:37pm |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
Good idea, but I want to keep with the txt format.
Poster:
|
aibek |
Date:
|
Nov 7, 2012 11:35pm |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
andrewbontrager,
My Unix computer shows that your file 282.txt is (using the `file' command):
Non-ISO extended-ASCII English text, with CR line terminators
This is highly non-standard. Something is wrong with your file. A web search for the string “Non-ISO extended-ASCII English text” brings up: “I realized that the majority of the file is in ISO-8859-1 and some parts in utf-8.…”
http://stackoverflow.com/questions/5901633/perl-file-encoding-and-word-comparison
I will soon check your file and offer suggestions.
This post was modified by aibek on 2012-11-08 07:35:18
Poster:
|
aibek |
Date:
|
Nov 8, 2012 12:23am |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
You have an invalid UTF-8 character in your file. Line 59 starts with:
What I claim as my improvement is
After that there is a 0x97 byte (octal \227) followed by two 0x0d bytes (CR, the classic Mac line terminator). This sequence is invalid UTF-8: 0x97 is a continuation byte, and a continuation byte may only appear after a lead byte.
The proper way to correct this error is to replace the \227 with the proper character in UTF-8 encoding. The no-brainer way is to put a hyphen there, making the file ASCII. Or you may put an en-dash or an em-dash or a quotation-dash. In all the above cases, the file would be valid UTF-8; plain ASCII is, obviously, valid UTF-8.
Try to find out how you got such an exotic character, so that it does not happen with random files!
---
Output of `isutf8':
282.txt: line 1, char 1, byte offset 2913: invalid UTF-8 code
Valid UTF-8 codes:
– en-dash (U+2013)
— em-dash (U+2014)
― quotation-dash (U+2015)
(You can copy these from this page.) To know which one to use, consult your style guide! Check this, though:
https://en.wikipedia.org/wiki/Dash
---
Note that the culprit character may not be visible in normal text editors even though it is there. (Some editors will even refuse to open the file.) Emacs, vim and hex editors show the character.
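If `isutf8` is not installed, a similar check can be sketched in Python by attempting a strict UTF-8 decode and reporting where it fails. The function name `first_invalid_utf8` and the sample bytes are hypothetical stand-ins; the sample only imitates the byte pattern described above and is not the actual contents of 282.txt.

```python
def first_invalid_utf8(data: bytes):
    """Return (byte_offset, byte_value) of the first invalid UTF-8 byte, or None."""
    try:
        data.decode("utf-8")  # strict decoding raises on the first bad byte
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# Hypothetical stand-in: a stray 0x97 followed by two CR bytes.
sample = b"What I claim as my improvement is\x97\r\rthe combination"
result = first_invalid_utf8(sample)
if result is None:
    print("valid UTF-8")
else:
    offset, value = result
    print(f"invalid byte 0x{value:02x} at byte offset {offset}")
```

For a real file, read it with `open(path, "rb").read()` and pass the bytes in. The offset is counted from zero, so it may differ by one from what other tools report.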
This post was modified by aibek on 2012-11-08 08:15:36
This post was modified by aibek on 2012-11-08 08:23:53
Poster:
|
aibek |
Date:
|
Nov 8, 2012 1:41am |
Forum:
|
texts
|
Subject:
|
Re: problem viewing full text |
I found out how you got the character.
Either (i) your text editor is set to save files in ANSI aka Windows-1252 aka cp1252. In this encoding 0x97 is the value for an em-dash. (See the link.)
Or, (ii) you copied your em-dash from a file in the Windows-1252 encoding and pasted it into your text file.
(The effect of both is the same -- your file is effectively in Win-1252 encoding.)
The solution to your problem is simple: convert the file from Win-1252 encoding to UTF-8 encoding. There are tools to do it automatically in Unix (the best is `iconv'). Or do it online (see the link below). Or, if you provide a collection of all the text files to me, I will convert it for you -- it is trivial on my machine.
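Where `iconv` is not available, the same conversion can be sketched in Python using its built-in `cp1252` codec. This is a minimal illustration under the assumption that the whole file really is Windows-1252; the function name `cp1252_to_utf8` and the sample bytes are made up for the example.

```python
def cp1252_to_utf8(data: bytes) -> bytes:
    """Reinterpret Windows-1252 bytes as text and re-encode them as UTF-8."""
    # Python's cp1252 codec raises UnicodeDecodeError on the five byte values
    # that Windows-1252 leaves undefined (0x81, 0x8d, 0x8f, 0x90, 0x9d).
    text = data.decode("cp1252")
    return text.encode("utf-8")


# Hypothetical sample: 0x97 is the em-dash (U+2014) in Windows-1252.
sample = b"my improvement is\x97the combination"
converted = cp1252_to_utf8(sample)
print(converted)  # the single 0x97 byte becomes a three-byte UTF-8 sequence
```

On a real file, read and write in binary mode (`"rb"` and `"wb"`) so nothing else is changed along the way. If the file mixes encodings, as in the Stack Overflow case quoted earlier, a blanket conversion like this would garble the parts that are already UTF-8.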
So that the problem does not happen in the future, set your text editor to use UTF-8 encoding. Win-1252 is a legacy encoding. (See link.) Or, if (ii) above is true, stop using your ‘master’ em-dash (the one you copy into the files you edit).
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WindowsBestFit/bestfit1252.txt
http://kanjidict.stc.cx/recode.php
https://en.wikipedia.org/wiki/Code_page#Windows_.28ANSI.29_code_pages
This post was modified by aibek on 2012-11-08 09:41:27