Wednesday, March 28, 2012

Strange character [Â] appearing in file.

Hi All,

I have an ASP.NET application which opens a text file and then saves it again.
This saved file is then passed into another software suite (not ASP.NET) for processing.
This secondary suite expects the lines in the file to be a certain length (legacy).

All text is transferred between the files correctly, except where a UK pound sign [£] appears in the original file.

This character is NOT transferred correctly using the default StreamReader/StreamWriter.

I have tried changing the encoding of both the StreamReader and the StreamWriter but cannot find one which does the job as expected. Changing the reader to ASCII and leaving the default writer is the closest, but not without issues.
Where this condition occurs, the resulting saved file shows 2 (!) characters, Â£, in the place where the original pound sign was.

This extra Â [Alt+0194] character is not seen by Notepad and is not visible when debugging, but it is there: it affects the fixed-length record processing mentioned above, and it can be seen in other text editors such as WordPad, DocPad etc.

Please help, this is causing some despair!

Thanks and Regards,
Ric

The following code sample demonstrates my point...
[A file c:\testInput.txt is also required with a line in it that contains a £ sign]
eg. 111111111111111£1111111111
[A file c:\testOutput.txt is created to demonstrate the point.]

'Requires "Imports System.IO" at the top of the file
Sub Main()

'file access vars
Dim fsRead As FileStream
Dim fsWrite As FileStream
Dim sr As StreamReader
Dim sw As StreamWriter

'The file to read from
Dim inputFile As String = "c:\testInput.txt"
Dim outputFile As String = "c:\testOutput.txt"

'Create the FileStream : INPUT
Try
fsRead = New FileStream(inputFile, FileMode.Open, FileAccess.Read, FileShare.Read)
Catch ex As Exception
End Try

'Create the StreamReader
Try
sr = New StreamReader(fsRead, System.Text.Encoding.ASCII)
Catch ex As Exception
End Try

'Create the FileStream : OUTPUT
Try
fsWrite = New FileStream(outputFile, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite)
Catch ex As Exception
End Try

'Create the StreamWriter
Try
sw = New StreamWriter(fsWrite)
Catch ex As Exception
End Try

Dim s As String
Dim replaced As String
While sr.Peek <> -1

'read a line from the input file
s = sr.ReadLine()

'substitute the misread ? characters into £ signs
replaced = Replace(s, "?", "£")

'move to the end of the file
sw.BaseStream.Seek(0, SeekOrigin.End)

'and write the string to the output file
sw.Write(replaced & vbCrLf)

End While

'Clear up...
Try
sw.Flush()
sw.Close()
fsWrite.Close()

sr.Close()
fsRead.Close()

Catch ex As Exception
End Try

'and dispose
sw = Nothing
sr = Nothing
fsRead = Nothing
fsWrite = Nothing

End Sub

Solution:

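When reading/writing plain text files using file streams on a Windows box you should specify an Encoding explicitly. StreamReader and StreamWriter default to UTF-8, which stores £ as two bytes (0xC2 0xA3), whereas the input file here is evidently in the Windows code page, where £ is the single byte 0xA3. The stray Â is that extra 0xC2 byte as displayed by an editor using the Windows code page.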

The Encoding can be created using the following code:
[1252 is the Windows Western European code page, also known as Windows-1252]

Dim enc As System.Text.Encoding
enc = System.Text.Encoding.GetEncoding(1252)

You can then specify this encoding to be used when creating the StreamReader/StreamWriter.
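As a rough illustration of passing the encoding to both streams, here is a minimal sketch of the copy loop. The module and method names (EncodingCopySketch, CopyWithCodePage1252) are made up for this example, and the paths are the ones from the test case above. It assumes the .NET Framework; on .NET Core / .NET 5 and later, GetEncoding(1252) is expected to fail unless the code pages provider has been registered first with Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).

```vb
Imports System.IO
Imports System.Text

Module EncodingCopySketch

    ' Hypothetical helper: copies a text file line by line, using
    ' Windows-1252 for both the read and the write so that single-byte
    ' characters such as the pound sign stay single-byte.
    Sub CopyWithCodePage1252(inputFile As String, outputFile As String)
        Dim enc As Encoding = Encoding.GetEncoding(1252)

        ' Using blocks close the streams even if an exception is thrown.
        ' The False argument means "overwrite", not "append".
        Using sr As New StreamReader(inputFile, enc)
            Using sw As New StreamWriter(outputFile, False, enc)
                Dim line As String = sr.ReadLine()
                While line IsNot Nothing
                    ' Write the line back with an explicit CR+LF terminator.
                    sw.Write(line & vbCrLf)
                    line = sr.ReadLine()
                End While
            End Using
        End Using
    End Sub

    Sub Main()
        CopyWithCodePage1252("c:\testInput.txt", "c:\testOutput.txt")
    End Sub

End Module
```

With the same encoding on both sides there is no need for the Replace(s, "?", "£") workaround, and each line keeps its original byte length.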

All characters in that code page (even the UK pound sign £) will then be correctly read/written by the streams.
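To see why the extra character appeared in the first place, the bytes can be inspected directly. The sketch below is only an illustration (the module name PoundSignBytes and the helper ToHex are invented here), with the same .NET Framework assumption about GetEncoding(1252) being available.

```vb
Imports System
Imports System.Text

Module PoundSignBytes

    ' Hypothetical helper: formats a byte array as hex, e.g. "C2-A3".
    Function ToHex(bytes As Byte()) As String
        Return BitConverter.ToString(bytes)
    End Function

    Sub Main()
        Dim cp1252 As Encoding = Encoding.GetEncoding(1252)
        Dim pound As String = ChrW(&HA3).ToString()   ' U+00A3, the pound sign

        ' One byte in Windows-1252, two bytes in UTF-8.
        Console.WriteLine(ToHex(cp1252.GetBytes(pound)))          ' prints A3
        Console.WriteLine(ToHex(Encoding.UTF8.GetBytes(pound)))   ' prints C2-A3

        ' Reading the UTF-8 bytes back as Windows-1252 yields two
        ' characters: U+00C2 (the stray one) followed by U+00A3.
        Dim misread As String = cp1252.GetString(Encoding.UTF8.GetBytes(pound))
        Console.WriteLine(misread.Length)                         ' prints 2

        ' An ASCII reader cannot represent byte A3 and substitutes "?".
        Console.WriteLine(Encoding.ASCII.GetString(New Byte() {&HA3}))   ' prints ?
    End Sub

End Module
```

The extra 0xC2 byte is what pushed the fixed-length records out by one position for every pound sign.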

Hope this helps,
Ric
