Commit Graph

304 Commits

Author SHA1 Message Date
Klondike Dragon
7a3c923820 Fix mm.dd.yyyy (time) format 2023-12-18 23:14:08 -07:00
Klondike Dragon
89df0f8c49 Comprehensive time validation 2023-12-18 20:52:16 -07:00
Klondike Dragon
a45d593447 Optimize checks for day of week and full month
Reduces CPU usage on large benchmarks by about 2-3% and prepares for future support of international month names.
2023-12-16 23:40:14 -07:00
Klondike Dragon
fbf07cc274 Optimize memory for error case
New option SimpleErrorMessages that avoids allocation in the error path. It's off by default to preserve backwards compatibility.

Added benchmark BenchmarkBigParseAnyErrors that takes the big set of test cases, and injects errors to make them fail at pseudo-random places.

This optimization speeds up the error path runtime by 4x and reduces error path allocation bytes by 13x!
2023-12-16 23:28:15 -07:00
Klondike Dragon
d2e1443c4d Comprehensive date format validation
Audit every stateDate so every unexpected alternative will fail.

In the process, fixed some newly found bugs:
* Extend format yyyy-mon-dd to allow times to follow it. Also allow full month name.
* Allow full day name before month (e.g., Monday January 4th, 2017)

Relevant confirmatory test cases were added.
2023-12-16 22:31:48 -07:00
Klondike Dragon
23f8fa1af0 Further optimize ambiguous parsing
Optimize the common and special case where mm and dd are the same length, just swap in place. Avoids having to reparse the entire string.

For this case, it's about 30% faster and reduces allocations by about 15%.

This format is especially common, hence the reason to optimize for this case.

Also fix the case for ambiguous date/time in the mm:dd:yyyy format.
2023-12-16 13:52:00 -07:00
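
The in-place swap can be sketched as follows, assuming the layout is held in a byte slice and the month and day tokens have the same width. swapInPlace and the fixed offsets are hypothetical; the real parser tracks the token positions itself.

```go
package main

import (
	"fmt"
	"time"
)

// swapInPlace exchanges the n bytes starting at index a with the n bytes
// starting at index b. The two regions must not overlap.
func swapInPlace(buf []byte, a, b, n int) {
	for i := 0; i < n; i++ {
		buf[a+i], buf[b+i] = buf[b+i], buf[a+i]
	}
}

func main() {
	// Month-first guess for an ambiguous input.
	layout := []byte("01/02/2006")

	// Swap the two-byte month and day tokens, giving "02/01/2006".
	swapInPlace(layout, 0, 3, 2)

	t, err := time.Parse(string(layout), "31/12/2023")
	fmt.Println(string(layout), t, err)
}
```

Because the tokens are the same length, nothing else in the layout moves, so the input string does not have to be scanned a second time.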
Klondike Dragon
ed5310d0c1 Optimize ambiguous date parsing
Previously, for ambiguous date strings, it was always calling parse twice even when the first parse would have been successful.

Refactor so that parsing isn't re-attempted unless the first parse fails ambiguously.

Benchmark results show that with RetryAmbiguousDateWithSwap(true), it's now about 6.5% faster (ns/op) and reduces allocated bytes by 3.4%.
2023-12-16 12:55:03 -07:00
Klondike Dragon
f4307ef59d Heavily optimize memory allocations
Uses a memory pool for the parser struct and the format []byte.

Uses a new Go 1.20 feature to avoid allocations for []byte to string conversions in allowable cases.

Go 1.20 also fixes a Go bug in parsing fractional seconds after a comma, so we can eliminate a workaround.

The remaining allocations are mostly unavoidable (e.g., time.Parse constructing a FixedZone location, or calls to strings.ToLower).

Results show an 89% reduction in allocated bytes for the big benchmark cases, and for some formats an allocation can be avoided entirely.

There is also a resulting 26% speedup in ns/op.

Details:

BEFORE:

cpu: 12th Gen Intel(R) Core(TM) i7-1255U
BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                    4736 B/op         42 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12               241422 B/op       2916 allocs/op
BenchmarkBigParseIn-12                244195 B/op       2984 allocs/op
BenchmarkBigParseRetryAmbiguous-12    260751 B/op       3715 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12             15903 B/op        200 allocs/op

AFTER:

BenchmarkShotgunParse-12               19448 B/op        474 allocs/op
BenchmarkParseAny-12                      48 B/op          2 allocs/op
BenchmarkBigShotgunParse-12          1075049 B/op      24106 allocs/op
BenchmarkBigParseAny-12                25394 B/op        824 allocs/op
BenchmarkBigParseIn-12                 28165 B/op        892 allocs/op
BenchmarkBigParseRetryAmbiguous-12     37880 B/op       1502 allocs/op
BenchmarkShotgunParseErrors-12         67080 B/op       1679 allocs/op
BenchmarkParseAnyErrors-12              3851 B/op        117 allocs/op
2023-12-16 10:48:24 -07:00
Klondike Dragon
0d2fd5e275 Add broader benchmarks
Uses the main test set for a broader stress test.
2023-12-16 08:18:54 -07:00
Klondike Dragon
0c3943eacd Support RabbitMQ log format (dd-mon-yyyy::hh:mm:ss)
Adapt https://github.com/araddon/dateparse/pull/122 by https://github.com/bizy01 to add support for RMQ log format. Refactor to avoid redundant code. Add format validations.

As a side note, this also adds support for the format dd-mm-yyyy:hh:mm:ss.
2023-12-15 20:22:47 -07:00
Klondike Dragon
249dd7368c Support git log format (Thu Apr 7 15:13:13 2005 -0700)
Adapt commit 99d9682a1c from https://github.com/araddon/dateparse/pull/92 by https://github.com/jiangxin (merge timeWsYearOffset case and validate format)
2023-12-15 17:42:07 -07:00
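
As an illustration, the git log example from the commit title maps onto the following standard library layout. This sketch uses time.Parse directly and is not how the library recognizes the format internally.

```go
package main

import (
	"fmt"
	"time"
)

// gitLogLayout is the time.Parse layout for dates as printed by git log.
const gitLogLayout = "Mon Jan 2 15:04:05 2006 -0700"

// parseGitLog parses a date in the git log style.
func parseGitLog(s string) (time.Time, error) {
	return time.Parse(gitLogLayout, s)
}

func main() {
	t, err := parseGitLog("Thu Apr 7 15:13:13 2005 -0700")
	fmt.Println(t, t.UTC(), err)
}
```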
Klondike Dragon
18ec8c69f6 Expand Chinese date format support
Inspired by https://github.com/araddon/dateparse/pull/132 from https://github.com/xwjdsh -- made this more general to all time formats that could follow, and added format validation.

Also include the related README.md touchup from https://github.com/araddon/dateparse/pull/136
2023-12-15 17:14:03 -07:00
Klondike Dragon
cc63421875 Support times after yyyy.mm.dd dates
Fix for this bug mentioned in https://github.com/araddon/dateparse/pull/134

Also, the other cases mentioned in this PR are not valid formats, so add them to the TestParseErrors test, to document that this is expected.
2023-12-14 23:47:31 -07:00
Klondike Dragon
23869f345e Add support for mm/dd/yyyy, hh:mm:ss
Incorporate PR https://github.com/araddon/dateparse/pull/156 from https://github.com/BrianLeishman and adapt to also validate the format
2023-12-14 23:14:26 -07:00
Klondike Dragon
14fb9398e4 Fix parsing for format (time) UTC[+-]NNNN
Fixes https://github.com/araddon/dateparse/issues/158
2023-12-14 22:57:42 -07:00
Klondike Dragon
d05b099ca6 Add better timezone explanation to README.md
How golang parses date strings with respect to time zones and locations can be really confusing. Document the key points that need to be understood to properly interpret the results of parsing arbitrary date strings, which may or may not include an explicit time zone name or offset.
2023-12-14 00:00:36 -07:00
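
One of the confusing points can be seen with the standard library alone. In this sketch, zoneOf is a hypothetical helper: time.Parse keeps a zone abbreviation it cannot resolve as a name with a zero offset, while a numeric offset is applied as written.

```go
package main

import (
	"fmt"
	"time"
)

// zoneOf parses value with layout and returns the zone name and UTC offset
// (in seconds) recorded on the resulting time.
func zoneOf(layout, value string) (string, int, error) {
	t, err := time.Parse(layout, value)
	if err != nil {
		return "", 0, err
	}
	name, offset := t.Zone()
	return name, offset, nil
}

func main() {
	// Abbreviation only: time.Parse works in UTC, where "MST" is unknown,
	// so the name is kept but the offset is zero.
	fmt.Println(zoneOf("2006-01-02 15:04:05 MST", "2023-12-14 00:00:36 MST"))

	// Explicit numeric offset: the offset is taken from the string.
	fmt.Println(zoneOf("2006-01-02 15:04:05 -0700", "2023-12-14 00:00:36 -0700"))
}
```

Parsing in a specific location with time.ParseInLocation lets abbreviations known to that location resolve to their real offsets.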
Klondike Dragon
2b3f700718 Handle format "date time (MST)"
Was unable to handle standalone timezone in parentheses before.

Also update tests to indicate expected timezone name for all tests that are parsed in a specific location.

With updated logic/fixes, add tests to verify:
* Fix https://github.com/araddon/dateparse/issues/71
* Fix https://github.com/araddon/dateparse/issues/72
2023-12-13 23:58:04 -07:00
Klondike Dragon
8f0059d6da Add tests to verify ambiguous cases
Test cases now validate that the following are true:
* Fixed https://github.com/araddon/dateparse/issues/91
* Fixed https://github.com/araddon/dateparse/issues/28

(previous commits already address these issues; these tests ensure
that they remain fixed)
2023-12-12 23:40:07 -07:00
Klondike Dragon
df9ae2e32a Incorporate support for yyyymmddhhmmss.SSS
Incorporate PR https://github.com/araddon/dateparse/pull/144 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/143
2023-12-12 23:19:35 -07:00
Klondike Dragon
fc278d32da Incorporate support for dd-mm-yyyy (digit month)
Incorporate PR https://github.com/araddon/dateparse/pull/140 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/139

This also fixes https://github.com/araddon/dateparse/issues/155
(duplicate of issue 139)

PR is adapted to avoid duplicate code and validate format.
2023-12-12 23:07:11 -07:00
Klondike Dragon
18938f16ae Implement support for yyyy mon dd (2013 May 02)
Incorporate PR https://github.com/araddon/dateparse/pull/142 from
https://github.com/dferstay to fix
https://github.com/araddon/dateparse/issues/141
2023-12-12 21:42:48 -07:00
Klondike Dragon
301ffeee02 Add support for mon/dd/yyyy (Oct/31/1970) 2023-12-12 21:24:17 -07:00
Klondike Dragon
49f9259ee3 Add support for dd[th,nd,st,rd] Month yyyy
Incorporate PR https://github.com/araddon/dateparse/pull/128 from
https://github.com/krhubert to fix
https://github.com/araddon/dateparse/issues/127
2023-12-12 20:18:58 -07:00
Klondike Dragon
c62ed15d73 Support PMDT and AMT time zones
Also disallow PM and AM from being specified twice in the string.

Fixes https://github.com/araddon/dateparse/issues/149
2023-12-12 17:42:09 -07:00
Klondike Dragon
3ebc8bc635 Incorporate fix for dd.mm.yyyy format
Incorporates PR https://github.com/araddon/dateparse/pull/133 from https://github.com/mehanizm to fix https://github.com/araddon/dateparse/issues/129

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/105
2023-12-11 23:46:44 -07:00
Klondike Dragon
1b1e0b3d33 Add extensive format validation, bugfixes
* Don't just assume we were given one of the valid formats.
* Also consolidate the parsing states that occur after timePeriod.
* Add subtests to make it easier to see what fails.
* Additional tests for 4-char timezone names.
* Fix https://github.com/araddon/dateparse/issues/117
* Fix https://github.com/araddon/dateparse/issues/150
* Fix https://github.com/araddon/dateparse/issues/157
* Fix https://github.com/araddon/dateparse/issues/145
* Fix https://github.com/araddon/dateparse/issues/108
* Fix https://github.com/araddon/dateparse/issues/137
* Fix https://github.com/araddon/dateparse/issues/130
* Fix https://github.com/araddon/dateparse/issues/123
* Fix https://github.com/araddon/dateparse/issues/109
* Fix https://github.com/araddon/dateparse/issues/98
* Addresses bug in https://github.com/araddon/dateparse/issues/100#issuecomment-1118868154

Adds test cases to verify the following are already fixed:
* https://github.com/araddon/dateparse/issues/94
2023-12-11 23:45:58 -07:00
Klondike Dragon
465140d619 Fix ineffective break statements 2023-12-08 18:31:28 -07:00
Arran Ubels
01b692d1ce Another case. 2023-02-16 09:39:34 +11:00
Arran Ubels
b0b5409675 Unused code 2023-02-15 23:37:27 +11:00
Arran Ubels
8b765a5302 Skip white space 2023-02-15 23:37:23 +11:00
Arran Ubels
19ef6a25eb New failure - still white space 2023-02-15 17:34:02 +11:00
Arran Ubels
b1fd89e43f The only required one. 2023-02-15 16:31:37 +11:00
Arran Ubels
3a32cbb3d2 All of these did nothing 2023-02-15 16:28:56 +11:00
Arran Ubels
bf3a5b3040 Skip white space - to delete strategically 2023-02-15 16:27:43 +11:00
Arran Ubels
268a690081 So people don't have to check the string they can use the new errors.Is function 2023-02-15 16:26:18 +11:00
Arran Ubels
c5a1edc710 My addition last 2023-02-15 16:24:05 +11:00
Arran Ubels
544b5426f4 Test improvements.. I think 2023-02-15 16:20:46 +11:00
Arran Ubels
53a8cbdf09 Unnecessary bracket 2023-02-15 16:10:45 +11:00
Arran Ubels
094aad3f21 Commented code 2023-02-15 16:09:32 +11:00
Arran Ubels
c5b562ac1a Added go releaser 2023-02-15 16:06:01 +11:00
Arran Ubels
eabb56b497 Text should be lowercase 2023-02-15 16:04:58 +11:00
Arran Ubels
515cd81767 S1023: redundant break statement (gosimple) 2023-02-15 16:02:13 +11:00
Arran Ubels
4345a38e91 Another error 2023-02-15 15:57:15 +11:00
Arran Ubels
cefe5b3dbe More typo changes 2023-02-15 15:56:43 +11:00
Arran Ubels
e654ac7b35 Bug fixes. 2023-02-15 15:56:17 +11:00
Arran Ubels
a8e238d5d1 Go mod tidy 2023-02-15 15:55:15 +11:00
Arran Ubels
ad0ab84f6b Lint action out of date. 2023-02-15 15:53:01 +11:00
Arran Ubels
57a1767ebd SA4006: this value of err is never used (staticcheck) 2023-02-15 15:51:48 +11:00
Arran Ubels
5143d47e3e S1023: redundant break statement (gosimple) 2023-02-15 15:51:14 +11:00
Arran Ubels
2fb4c46691 S1021: should merge variable declaration with assignment on next line (gosimple) 2023-02-15 15:50:42 +11:00