You're probably seeing those artifacts because your model never sees the pixels immediately outside each tile, so it can't know how to "blend" across tile boundaries. (I'm assuming your tiles are laid out with a stride equal to the input size, i.e., no overlap.)
A typical approach I've seen used (and used myself) is, at inference time, to keep only a central portion of each tile's output and to use overlapping windows so the kept portions still cover the whole image.
The more overlap you use (and the less of each output you keep), the less apparent the artifacts will be, in my experience, but the more computation is needed.
Here's what I mean in cartoon form:
Current situation (as I assume it is):
input tile1 #####----------
input tile2 -----#####-----
output tile1 ##########--------------------
output tile2 ----------##########----------
My suggestion:
Here, for the output, # means the value at that position is copied into the final output,
and * means the model makes a prediction there but it is discarded.
input tile1 #####----------
input tile2 ---#####-------
output tile1 **######**--------------------
output tile2 ------**######**--------------
If you go with this idea, you could even simplify the model so that it doesn't attempt to make predictions at locations you would ignore anyway.
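If it helps, here's a minimal 1-D sketch of the idea in NumPy. The numbers (tile width 5, stride 3, 2x upscaling) mirror the cartoon above; model here is just a stand-in for your network's forward pass, and I've left out handling of the image borders (e.g., padding the input, or keeping the full edge of the first and last tiles).

import numpy as np

SCALE = 2                                # upscaling factor of the (hypothetical) model
TILE = 5                                 # input tile width
STRIDE = 3                               # step between tile starts; overlap = TILE - STRIDE
MARGIN = (TILE - STRIDE) // 2 * SCALE    # output pixels discarded on each side ("*" above)

def model(tile):
    # Stand-in for your network: nearest-neighbour upscaling, just so this runs.
    return np.repeat(tile, SCALE)

def tiled_inference(signal):
    out = np.zeros(len(signal) * SCALE)
    for start in range(0, len(signal) - TILE + 1, STRIDE):
        pred = model(signal[start:start + TILE])       # full prediction ("*" and "#")
        keep = pred[MARGIN:MARGIN + STRIDE * SCALE]    # central part only ("#")
        lo = start * SCALE + MARGIN
        out[lo:lo + STRIDE * SCALE] = keep             # copy it into the final output
    return out

For 2-D images you'd apply the same cropping independently along both axes, and the first/last tiles along each axis need their outer margins kept (or the input padded) so the borders are covered too.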
And please comment if anything needs clarification.